Abstract. Boolean satisfiability problems are an important benchmark for questions about complexity,
algorithms, heuristics and threshold phenomena. Recent work on heuristics and the satisfiability
threshold has centered around the structure and connectivity of the solution space. Motivated by this
work, we study structural and connectivity-related properties of the space of solutions of Boolean
satisfiability problems and establish various dichotomies in Schaefer's framework.

On the structural side, we obtain dichotomies for the kinds of subgraphs of the hypercube that can be induced 
by the solutions of Boolean formulas, as well as for the diameter of the connected components of the solution 
space. On the computational side, we establish dichotomy theorems for the complexity of the connectivity and st- 
connectivity questions for the graph of solutions of Boolean formulas. Our results assert that the intractable side 
of the computational dichotomies is PSPACE-complete, while the tractable side - which includes but is not limited 
to all problems with polynomial time algorithms for satisfiability - is in P for the st-connectivity question, and in 
coNP for the connectivity question. The diameter of components can be exponential for the PSPACE-complete 
cases, whereas in all other cases it is linear; thus, small diameter and tractability of the connectivity problems are 
remarkably aligned. The crux of our results is an expressibility theorem showing that in the tractable cases, the 
subgraphs induced by the solution space possess certain good structural properties, whereas in the intractable cases, 
the subgraphs can be arbitrary. 
1. Introduction. In 1978, T.J. Schaefer [22] introduced a rich framework for expressing
variants of Boolean satisfiability and proved a remarkable dichotomy theorem: the satisfiability
problem is in P for certain classes of Boolean formulas, while it is NP-complete for
all other classes in the framework. In a single stroke, this result pinpoints the computational
complexity of all well-known variants of Sat, such as 3-Sat, Horn 3-Sat, Not-All-Equal 3-Sat,
and 1-in-3 Sat. Schaefer's work paved the way for a series of investigations
establishing dichotomies for several aspects of satisfiability, including optimization [6, 8, 14],
counting [?], inverse satisfiability [?], minimal satisfiability [?], 3-valued satisfiability [5],
and propositional abduction [9].

Our aim in this paper is to carry out a comprehensive exploration of a different aspect of 
Boolean satisfiability, namely, the connectivity properties of the space of solutions of Boolean 
formulas. The solutions (satisfying assignments) of a given $n$-variable Boolean formula $\varphi$
induce a subgraph $G(\varphi)$ of the $n$-dimensional hypercube. Thus, the following two decision
problems, called the connectivity problem and the st-connectivity problem, arise naturally:
(i) Given a Boolean formula $\varphi$, is $G(\varphi)$ connected? (ii) Given a Boolean formula $\varphi$ and two
solutions $s$ and $t$ of $\varphi$, is there a path from $s$ to $t$ in $G(\varphi)$?

We believe that connectivity properties of Boolean satisfiability merit study in their own
right, as they shed light on the structure of the solution space of Boolean formulas. Moreover,
in recent years the structure of the space of solutions for random instances has been
the main consideration at the basis of both algorithms for and mathematical analysis of the
satisfiability problem [2] [21] [20] [?]. It has been conjectured for 3-Sat [20] and proved for
8-Sat [19, 3] that the solution space fractures as one approaches the critical region from
below. This apparently leads to performance deterioration of the standard satisfiability
algorithms, such as WalkSAT [23] and DPLL [1]. It is also the main consideration behind the
design of the survey propagation algorithm, which has far superior performance on random
instances of satisfiability [20]. This body of work has served as a motivation to us for pursuing
the investigation reported here. While there has been an intensive study of the structure of
the solution space of Boolean satisfiability problems for random instances, our work seems
to be the first to explore this issue from a worst-case viewpoint.

Our first main result is a dichotomy theorem for the st-connectivity problem. This result 
reveals that the tractable side is much more generous than the tractable side for satisfiability, 
while the intractable side is PSPACE-complete. Specifically, Schaefer showed that the sat- 
isfiability problem is solvable in polynomial time precisely for formulas built from Boolean 
relations all of which are bijunctive, or all of which are Horn, or all of which are dual Horn, 
or all of which are affine. We identify new classes of Boolean relations, called tight rela- 
tions, that properly contain the classes of bijunctive, Horn, dual Horn, and affine relations. 
We show that st-connectivity is solvable in linear time for formulas built from tight relations,
and PSPACE-complete in all other cases. Our second main result is a dichotomy theorem for 
the connectivity problem: it is in coNP for formulas built from tight relations, and PSPACE- 
complete in all other cases. 

In addition to these two complexity-theoretic dichotomies, we establish a structural di- 
chotomy theorem for the diameter of the connected components of the solution space of 
Boolean formulas. This result asserts that, in the PSPACE-complete cases, the diameter of 
the connected components can be exponential, but in all other cases it is linear. Thus, small 
diameter and tractability of the st-connectivity problem are remarkably aligned.

To establish these results, the main challenge is to show that for non-tight relations, both 
the connectivity problem and the st-connectivity problem are PSPACE-hard. In Schaefer's
Dichotomy Theorem, NP-hardness of satisfiability was a consequence of an expressibility 
theorem, which asserted that every Boolean relation can be obtained as a projection over 
a formula built from clauses in the "hard" relations. Schaefer's notion of expressibility is 
inadequate for our problem. Instead, we introduce and work with a delicate and stricter no- 
tion of expressibility, which we call faithful expressibility. Intuitively, faithful expressibility 
means that, in addition to definability via a projection, the space of witnesses of the existential 
quantifiers in the projection has certain strong connectivity properties that allow us to cap- 
ture the graph structure of the relation that is being defined. It should be noted that Schaefer's 
Dichotomy Theorem can also be proved using a Galois connection and Post's celebrated
classification of the lattice of Boolean clones (see [4]). This method, however, does not appear to
apply to connectivity, as the boundaries discovered here cut across Boolean clones. Thus, the 
use of faithful expressibility or some other refined definability technique seems unavoidable. 

The first step towards proving PSPACE-completeness is to show that both connectivity 
and st-connectivity are hard for 3-CNF formulae; this is proved by a reduction from a generic 
PSPACE computation. Next, we identify the simplest relations that are not tight: these are 
ternary relations whose graph is a path of length 4 between assignments at Hamming distance 
2. We show that these paths can faithfully express all 3-CNF clauses. The crux of our 
hardness result is an expressibility theorem to the effect that one can faithfully express such a 
path from any set of relations which is not tight. 

Finally, we show that all tight relations have "good" structural properties. Specifically, in 
a tight relation every component has a unique minimum element, or every component has a
unique maximum element, or the Hamming distance coincides with the shortest-path distance
in the relation. These properties are inherited by every formula built from tight relations, and 
yield both small diameter and linear algorithms for st-connectivity. 

Our original hope was that tractability results for connectivity could conceivably inform 
heuristic algorithms for satisfiability and enhance their effectiveness. In this context, our 
findings are prima facie negative: we show that when satisfiability is intractable, then con- 
nectivity is also intractable. But our results do contain a glimmer of hope: there are broad 
classes of intractable satisfiability problems, those built from tight relations, with polynomial 
st-connectivity and small diameter. It would be interesting to investigate if these properties
make random instances built from tight relations easier for WalkSAT and similar heuristics, 
and if so, whether such heuristics are amenable to rigorous analysis. 

An extended abstract of this paper appears in ICALP'06 [10].

2. Basic Concepts and Statements of Results. A CNF formula is a Boolean formula
of the form $C_1 \wedge \cdots \wedge C_n$, where each $C_i$ is a clause, i.e., a disjunction of literals. If $k$ is a
positive integer, then a $k$-CNF formula is a CNF formula $C_1 \wedge \cdots \wedge C_n$ in which each clause
$C_i$ is a disjunction of at most $k$ literals.

A logical relation $R$ is a non-empty subset of $\{0,1\}^k$, for some $k \ge 1$; $k$ is the arity
of $R$. Let $S$ be a finite set of logical relations. A CNF($S$)-formula over a set of variables
$V = \{x_1, \ldots, x_n\}$ is a finite conjunction $C_1 \wedge \cdots \wedge C_n$ of clauses built using relations from
$S$, variables from $V$, and the constants 0 and 1; this means that each $C_i$ is an expression of the
form $R(\xi_1, \ldots, \xi_k)$, where $R \in S$ is a relation of arity $k$, and each $\xi_j$ is a variable in $V$ or one
of the constants 0, 1. A solution of a CNF($S$)-formula $\varphi$ is an assignment $s = (a_1, \ldots, a_n)$
of Boolean values to the variables that makes every clause of $\varphi$ true. A CNF($S$)-formula is
satisfiable if it has at least one solution.

The satisfiability problem Sat($S$) associated with a finite set $S$ of logical relations asks:
given a CNF($S$)-formula $\varphi$, is it satisfiable? All well-known restrictions of Boolean
satisfiability, such as 3-Sat, Not-All-Equal 3-Sat, and Positive 1-in-3 Sat, can be cast
as Sat($S$) problems, for a suitable choice of $S$. For instance, let $R_0 = \{0,1\}^3 \setminus \{000\}$,
$R_1 = \{0,1\}^3 \setminus \{100\}$, $R_2 = \{0,1\}^3 \setminus \{110\}$, $R_3 = \{0,1\}^3 \setminus \{111\}$. Then 3-Sat is the
problem Sat($\{R_0, R_1, R_2, R_3\}$). Similarly, Positive 1-in-3 Sat is Sat($\{R_{1/3}\}$), where
$R_{1/3} = \{100, 010, 001\}$.
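The definitions above can be made concrete with a small brute-force sketch. The representation is an assumption of this illustration, not something fixed by the paper: a relation is a Python set of bit tuples, a clause is a pair (relation, arguments), variables are strings, and constants are the integers 0 and 1. The helper names `rel`, `full`, `clause_holds` and `solutions` are hypothetical.

```python
from itertools import product

def rel(*words):
    """Build a relation (a set of bit tuples) from strings such as "100"."""
    return {tuple(int(ch) for ch in w) for w in words}

def full(k):
    """The full relation {0,1}^k."""
    return set(product((0, 1), repeat=k))

# Relations named in the text.
R0, R1, R2, R3 = (full(3) - rel(w) for w in ("000", "100", "110", "111"))
R_1IN3 = rel("100", "010", "001")

def clause_holds(relation, args, assignment):
    """args mixes variable names (str) with the constants 0 and 1 (int)."""
    values = tuple(assignment[a] if isinstance(a, str) else a for a in args)
    return values in relation

def solutions(formula, variables):
    """Enumerate all solutions of a formula given as a list of (relation, args)."""
    found = []
    for bits in product((0, 1), repeat=len(variables)):
        assignment = dict(zip(variables, bits))
        if all(clause_holds(r, args, assignment) for r, args in formula):
            found.append(bits)
    return found

variables = ["x1", "x2", "x3"]
# (x1 or x2 or x3) and (not x1 or x2 or x3), written as a CNF({R0, R1})-formula.
three_sat = [(R0, ("x1", "x2", "x3")), (R1, ("x1", "x2", "x3"))]
# A Positive 1-in-3 Sat instance whose second clause uses the constant 0.
one_in_three = [(R_1IN3, ("x1", "x2", "x3")), (R_1IN3, ("x1", "x2", 0))]
```

The enumeration is exponential in the number of variables, so it is only meant for checking small examples by hand.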

Schaefer [22] identified the complexity of every satisfiability problem Sat($S$), where $S$
ranges over all finite sets of logical relations. To state Schaefer's main result, we need to
define some basic concepts.

Definition 2.1. Let $R$ be a logical relation.

1. R is bijunctive if it is the set of solutions of a 2-CNF formula. 

2. R is Horn if it is the set of solutions of a Horn formula, where a Horn formula is a 
CNF formula such that each conjunct has at most one positive literal. 

3. R is dual Horn if it is the set of solutions of a dual Horn formula, where a dual Horn 
formula is a CNF formula such that each conjunct has at most one negative literal. 

4. R is affine if it is the set of solutions of a system of linear equations over Z2. 

Each of these types of logical relations can be characterized in terms of closure properties
[22]. A relation $R$ is bijunctive if and only if it is closed under the majority operation; this
means that if $a, b, c \in R$, then $\mathrm{maj}(a, b, c) \in R$, where $\mathrm{maj}(a, b, c)$ is the vector whose
$i$-th bit is the majority of $a_i, b_i, c_i$. A relation $R$ is Horn if and only if it is closed under $\wedge$; this
means that if $a, b \in R$, then $a \wedge b \in R$, where $a \wedge b$ is the vector whose $i$-th bit is $a_i \wedge b_i$.
Similarly, $R$ is dual Horn if and only if it is closed under $\vee$. Finally, $R$ is affine if and only
if it is closed under $a \oplus b \oplus c$. Thus there is a polynomial-time algorithm (in fact, a cubic
algorithm) to test if a relation is Schaefer.
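The closure characterizations translate directly into a membership test that is cubic in the number of tuples of the relation. The sketch below is illustrative only; it uses the standard convention that Horn relations (at most one positive literal per clause) are closed under componentwise AND and dual Horn relations under componentwise OR, and the function names are hypothetical.

```python
from itertools import product

def rel(*words):
    """Build a relation (a set of bit tuples) from strings such as "01"."""
    return {tuple(int(ch) for ch in w) for w in words}

def closed_under(relation, op, arity):
    """Check closure of `relation` under `op` applied coordinate by coordinate."""
    return all(
        tuple(op(*bits) for bits in zip(*vectors)) in relation
        for vectors in product(relation, repeat=arity)
    )

def is_bijunctive(r):
    return closed_under(r, lambda a, b, c: int(a + b + c >= 2), 3)

def is_horn(r):
    return closed_under(r, lambda a, b: a & b, 2)

def is_dual_horn(r):
    return closed_under(r, lambda a, b: a | b, 2)

def is_affine(r):
    return closed_under(r, lambda a, b, c: a ^ b ^ c, 3)

def is_schaefer(relations):
    """A set is Schaefer if one of the four properties holds for all its relations."""
    tests = (is_bijunctive, is_horn, is_dual_horn, is_affine)
    return any(all(test(r) for r in relations) for test in tests)

OR = rel("01", "10", "11")
NAND = rel("00", "01", "10")
XOR = rel("01", "10")
R_1IN3 = rel("100", "010", "001")
```

For example, `is_schaefer([R_1IN3])` is false, which matches the NP-completeness of Positive 1-in-3 Sat.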

Definition 2.2. A set S of logical relations is Schaefer if at least one of the following 
conditions holds: 

1. Every relation in S is bijunctive. 

2. Every relation in S is Horn. 

3. Every relation in S is dual Horn. 

4. Every relation in S is affine.

Theorem 2.3. (Schaefer's Dichotomy Theorem [22]) Let $S$ be a finite set of logical
relations. If $S$ is Schaefer, then Sat($S$) is in P; otherwise, Sat($S$) is NP-complete.

Theorem 2.3 is called a dichotomy theorem because Ladner [16] has shown that if P $\neq$
NP, then there are problems in NP that are neither in P, nor NP-complete. Thus, Theorem 2.3
asserts that no Sat($S$) problem is a problem of the kind discovered by Ladner. Note that the
aforementioned characterization of Schaefer sets in terms of closure properties yields a cubic
algorithm for determining, given a finite set $S$ of logical relations, whether Sat($S$) is in P or
is NP-complete (here, the input size is the sum of the sizes of the relations in $S$).

The more difficult part of the proof of Schaefer's Dichotomy Theorem is to show that if
$S$ is not Schaefer, then Sat($S$) is NP-complete. This is a consequence of a powerful result
about the expressibility of logical relations. We say that a relation $R$ is expressible from a set
$S$ of relations if there is a CNF($S$)-formula $\varphi(x, y)$ such that $R = \{a \mid \exists y\, \varphi(a, y)\}$.

Theorem 2.4. (Schaefer's Expressibility Theorem [22]) Let $S$ be a finite set of logical
relations. If $S$ is not Schaefer, then every logical relation is expressible from $S$.

In this paper, we are interested in the connectivity properties of the space of solutions
of CNF($S$)-formulas. If $\varphi$ is a CNF($S$)-formula with $n$ variables, then the solution graph
$G(\varphi)$ of $\varphi$ denotes the subgraph of the $n$-dimensional hypercube induced by the solutions of
$\varphi$. This means that the vertices of $G(\varphi)$ are the solutions of $\varphi$, and there is an edge between
two solutions of $G(\varphi)$ precisely when they differ in exactly one variable.
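To make the solution graph tangible, here is a minimal brute-force sketch that takes an explicit set of solutions (bit tuples) and explores the induced subgraph of the hypercube by breadth-first search. It is an illustration under the assumption that the solutions have already been enumerated, so it says nothing about the complexity results discussed in this paper; the function names are hypothetical, and an empty solution set is treated as connected by convention.

```python
from collections import deque

def neighbors(v):
    """All vertices of the hypercube at Hamming distance 1 from v."""
    for i in range(len(v)):
        yield v[:i] + (1 - v[i],) + v[i + 1:]

def components(solution_set):
    """Connected components of the subgraph induced by the solutions."""
    remaining = set(solution_set)
    comps = []
    while remaining:
        start = remaining.pop()
        comp = {start}
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in neighbors(v):
                if w in remaining:
                    remaining.remove(w)
                    comp.add(w)
                    queue.append(w)
        comps.append(comp)
    return comps

def is_connected(solution_set):
    return len(components(solution_set)) <= 1

def st_connected(solution_set, s, t):
    return any(s in comp and t in comp for comp in components(solution_set))

# Solution sets of the single-clause formulas R_{1/3}(x1,x2,x3) and R_NAE(x1,x2,x3).
R_1IN3 = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}
R_NAE = {(1, 0, 0), (1, 1, 0), (0, 1, 0), (0, 1, 1), (0, 0, 1), (1, 0, 1)}
```

The three solutions of the 1-in-3 clause are pairwise at Hamming distance 2, so they form three isolated vertices, while the six solutions of the Not-All-Equal clause form a single 6-cycle.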

We consider the following two algorithmic problems for CNF($S$)-formulas.

Problem 1. The Connectivity Problem Conn($S$):

Given a CNF($S$)-formula $\varphi$, is $G(\varphi)$ connected?

Problem 2. The st-Connectivity Problem st-Conn($S$):

Given a CNF($S$)-formula $\varphi$ and two solutions $s$ and $t$ of $\varphi$, is there a path from $s$ to $t$ in
$G(\varphi)$?

To pinpoint the computational complexity of Conn($S$) and st-Conn($S$), we need to
introduce certain new types of relations.

Definition 2.5. Let $R \subseteq \{0,1\}^k$ be a logical relation.

1. $R$ is componentwise bijunctive if every connected component of the graph $G(R)$ is
a bijunctive relation.

2. $R$ is OR-free if the relation OR $= \{01, 10, 11\}$ cannot be obtained from $R$ by setting
$k - 2$ of the coordinates of $R$ to a constant $c \in \{0,1\}^{k-2}$. In other words, $R$ is OR-free
if $(x_1 \vee x_2)$ is not definable from $R$ by fixing $k - 2$ variables.

3. $R$ is NAND-free if the relation NAND $= \{00, 01, 10\}$ cannot be obtained from $R$
by setting $k - 2$ of the coordinates of $R$ to a constant $c \in \{0,1\}^{k-2}$. In other words,
$R$ is NAND-free if $(\bar{x}_1 \vee \bar{x}_2)$ is not definable from $R$ by fixing $k - 2$ variables.

We are now ready to introduce the key concept of a tight set of relations. 
Definition 2.6. A set S of logical relations is tight if at least one of the following three 
conditions holds: 

1. Every relation in S is componentwise bijunctive; 

2. Every relation in S is OR-free; 

3. Every relation in S is NAND-free.
In Section 4 we show that if $S$ is Schaefer, then it is tight. Moreover, we show that the
converse does not hold. It is also easy to see that there is a polynomial-time algorithm (in
fact, a cubic algorithm) for testing whether a given relation is tight.
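One straightforward way to test tightness, sketched here under the same set-of-tuples representation as an assumption, is to check the three conditions of the definition literally: majority closure inside each connected component, and the absence of OR (respectively NAND) among all restrictions to two coordinates. The function names are hypothetical and this is not claimed to be the authors' algorithm.

```python
from itertools import combinations

def rel(*words):
    return {tuple(int(ch) for ch in w) for w in words}

def components(relation):
    """Connected components of the hypercube subgraph induced by `relation`."""
    remaining, comps = set(relation), []
    while remaining:
        stack = [remaining.pop()]
        comp = set(stack)
        while stack:
            v = stack.pop()
            for i in range(len(v)):
                w = v[:i] + (1 - v[i],) + v[i + 1:]
                if w in remaining:
                    remaining.remove(w)
                    comp.add(w)
                    stack.append(w)
        comps.append(comp)
    return comps

def majority(a, b, c):
    return tuple(int(x + y + z >= 2) for x, y, z in zip(a, b, c))

def is_componentwise_bijunctive(relation):
    return all(
        majority(a, b, c) in comp
        for comp in components(relation)
        for a in comp for b in comp for c in comp
    )

def two_coordinate_restrictions(relation):
    """Binary relations obtained by fixing all but two coordinates to constants."""
    k = len(next(iter(relation)))
    for i, j in combinations(range(k), 2):
        groups = {}
        for v in relation:
            rest = tuple(v[m] for m in range(k) if m not in (i, j))
            groups.setdefault(rest, set()).add((v[i], v[j]))
        yield from groups.values()

OR = rel("01", "10", "11")
NAND = rel("00", "01", "10")

def is_or_free(relation):
    return all(r != OR for r in two_coordinate_restrictions(relation))

def is_nand_free(relation):
    return all(r != NAND for r in two_coordinate_restrictions(relation))

def is_tight(relations):
    tests = (is_componentwise_bijunctive, is_or_free, is_nand_free)
    return any(all(test(r) for r in relations) for test in tests)

R_1IN3 = rel("100", "010", "001")
R_NAE = rel("100", "110", "010", "011", "001", "101")
```

On the two running examples, `is_tight([R_1IN3])` holds because every component is a single vertex, whereas `R_NAE` fails all three conditions.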

Just as Schaefer's dichotomy theorem follows from an expressibility statement, our di- 
chotomy theorems are derived from the following theorem, which we will call the Faithful 
Expressibility Theorem. The precise definition of the concept of faithful expressibility is 
given in Section 3. Intuitively, this concept strengthens the concept of expressibility with the
requirement that the space of the witnesses to the existentially quantified variables has certain 
strong connectivity properties. 

Theorem 2.7. (Faithful Expressibility Theorem) Let S be a finite set of logical rela- 
tions. If S is not tight, then every logical relation is faithfully expressible from S. 

Using the Faithful Expressibility Theorem, we obtain the following dichotomy theorems
for the computational complexity of Conn($S$) and st-Conn($S$).

Theorem 2.8. Let $S$ be a finite set of logical relations. If $S$ is tight, then Conn($S$) is
in coNP; otherwise, it is PSPACE-complete.

Theorem 2.9. Let $S$ be a finite set of logical relations. If $S$ is tight, then st-Conn($S$)
is in P; otherwise, st-Conn($S$) is PSPACE-complete.

We also show that if $S$ is tight, but not Schaefer, then Conn($S$) is coNP-complete.

To illustrate these results, consider the set $S = \{R_{1/3}\}$, where $R_{1/3} = \{100, 010, 001\}$.
This set is tight (actually, it is componentwise bijunctive), but not Schaefer. It follows that
Sat($S$) is NP-complete (recall that this problem is Positive 1-in-3 Sat), st-Conn($S$) is
in P, and Conn($S$) is coNP-complete. Consider also the set $S = \{R_{NAE}\}$, where $R_{NAE} =
\{0,1\}^3 \setminus \{000, 111\}$. This set is not tight, hence Sat($S$) is NP-complete (this problem is
Positive Not-All-Equal 3-Sat), while both st-Conn($S$) and Conn($S$) are PSPACE-complete.

The dichotomy in the computational complexity of Conn($S$) and st-Conn($S$) is
accompanied by a parallel structural dichotomy in the size of the diameter of $G(\varphi)$ (where, for
a CNF($S$)-formula $\varphi$, the diameter of $G(\varphi)$ is the maximum of the diameters of the
components of $G(\varphi)$).

Theorem 2.10. Let $S$ be a finite set of logical relations. If $S$ is tight, then for every
CNF($S$)-formula $\varphi$, the diameter of $G(\varphi)$ is linear in the number of variables of $\varphi$;
otherwise, there are CNF($S$)-formulas $\varphi$ such that the diameter of $G(\varphi)$ is exponential in the
number of variables of $\varphi$.

Our results and their comparison to Schaefer's Dichotomy Theorem are summarized in 
the table below. 



S                     Sat(S)      st-Conn(S)      Conn(S)         Diameter
Schaefer              P           P               coNP            $O(n)$
Tight, non-Schaefer   NP-compl.   P               coNP-compl.     $O(n)$
Non-tight             NP-compl.   PSPACE-compl.   PSPACE-compl.   $2^{\Omega(\sqrt{n})}$



We conjecture that the complexity of Conn($S$) exhibits a trichotomy, that is, for every
finite set $S$ of logical relations, one of the following holds:

1. Conn($S$) is in P;

2. Conn($S$) is coNP-complete;

3. Conn($S$) is PSPACE-complete.

As mentioned above, we will show that if $S$ is tight but not Schaefer, then Conn($S$)
is coNP-complete. We will also show that if $S$ is bijunctive or affine, then Conn($S$) is in
P. Hence, to settle the above conjecture, it remains to pinpoint the complexity of Conn($S$)
whenever $S$ is Horn and whenever $S$ is dual Horn. In the conference version [10] of the
present paper, we further conjectured that if $S$ is Horn or dual Horn, then Conn($S$) is in
P. In other words, we conjectured that if $S$ is Schaefer, then Conn($S$) is in P. This second
conjecture, however, was subsequently disproved by Makino, Tamaki and Yamamoto [17],
who discovered a particular Horn set $S$ such that Conn($S$) is coNP-complete. Here, we
go beyond the results obtained in the conference version of the present paper and identify
additional conditions on a Horn set $S$ implying that Conn($S$) is in P. These new results
suggest a natural dichotomy within Schaefer sets of relations and, thus, provide evidence for
the trichotomy conjecture.

The remainder of this paper is organized as follows. In Section 3, we prove the Faithful
Expressibility Theorem, establish the hard side of the dichotomies for Conn($S$) and for
st-Conn($S$), and contrast our result to Schaefer's Expressibility and Dichotomy Theorems.
In Section 4, we describe the easy side of the dichotomy: the polynomial-time algorithms
and the structural properties for tight sets of relations. In addition, we obtain partial results
towards the trichotomy conjecture for Conn($S$).

3. The Hard Case of the Dichotomy: Non-Tight Sets of Relations. In this section,
we address the hard side of the dichotomy, where we deal with the computationally
intractable cases. As with other dichotomy theorems, this is also the harder part of our proof.
We define the notion of faithful expressibility in Section 3.1 and prove the Faithful
Expressibility Theorem in Section 3.2. This theorem implies that for all non-tight sets $S$ and $S'$, the
connectivity problems Conn($S$) and Conn($S'$) are polynomial-time equivalent; moreover,
the same holds true for the connectivity problems st-Conn($S$) and st-Conn($S'$). In addition,
the diameters of the solution graphs of CNF($S$)-formulas and CNF($S'$)-formulas are
also related polynomially. In Section 3.3, we prove that for 3-CNF formulas the connectivity
problems are PSPACE-complete, and the diameter can be exponential. This fact combined
with the Faithful Expressibility Theorem yields the hard side of all of our dichotomy results,
as well as the exponential size of the diameter.

We will use $a, b, \ldots$ to denote Boolean vectors, and $x$ and $y$ to denote vectors of variables.
We write $|a|$ to denote the Hamming weight (number of 1's) of a Boolean vector $a$.
Given two Boolean vectors $a$ and $b$, we write $|a - b|$ to denote the Hamming distance between
$a$ and $b$. Finally, if $a$ and $b$ are solutions of a Boolean formula $\varphi$ and lie in the same
component of $G(\varphi)$, then we write $d_\varphi(a, b)$ to denote the shortest-path distance between $a$
and $b$ in $G(\varphi)$.

3.1. Faithful Expressibility. As we mentioned in the previous section, in his dichotomy
theorem, Schaefer [22] used the following notion of expressibility: a relation $R$ is expressible
from a set $S$ of relations if there is a CNF($S$)-formula $\varphi$ so that $R = \{a \mid \exists y\, \varphi(a, y)\}$.
This notion is not sufficient for our purposes. Instead, we introduce a more delicate notion,
which we call faithful expressibility. Intuitively, we view the relation $R$ as a subgraph of the
hypercube, rather than just a subset, and require that this graph structure be also captured by
the formula $\varphi$.

Definition 3.1. A relation $R$ is faithfully expressible from a set of relations $S$ if there
is a CNF($S$)-formula $\varphi$ such that the following conditions hold:

1. $R = \{a \mid \exists y\, \varphi(a, y)\}$;

2. For every $a \in R$, the graph $G(\varphi(a, y))$ is connected;

3. For $a, b \in R$ with $|a - b| = 1$, there exists $w$ such that $(a, w)$ and $(b, w)$ are
solutions of $\varphi$.

For $a \in R$, the witnesses of $a$ are the $y$'s such that $\varphi(a, y)$ is true. The last two conditions
say that the witnesses of $a \in R$ are connected, and that neighboring $a, b \in R$ have a common
witness. This allows us to simulate an edge $(a, b)$ in $G(R)$ by a path in $G(\varphi)$, and thus relate
the connectivity properties of the solution spaces. There is, however, a price to pay: it is much
harder to come up with formulas that faithfully express a relation $R$. An example is when $S$
is the set of all paths of length 4 in $\{0,1\}^3$, a set that plays a crucial role in our proof. While
3-Sat relations are easily expressible from $S$ in Schaefer's sense, the CNF($S$)-formulas that
faithfully express 3-Sat relations are fairly complicated and have a large witness space.

Fig. 3.1. Expressing the relation $(x_1 \vee x_2 \vee x_3)$ using Not-All-Equal relations.
(a) Graph of $(x_1 \vee x_2 \vee x_3)$;
(b) Graph of a faithful expression: $\varphi(x, y_1, y_2) = R_{NAE}(x_1, x_2, y_1) \wedge R_{NAE}(x_2, x_3, y_2) \wedge R_{NAE}(y_1, y_2, 1)$.
(c) Graph of an unfaithful expression $\varphi(x, y_1)$, a conjunction of three $R_{NAE}$ clauses beginning with $R_{NAE}(x_1, x_2, y_1)$.
In both cases $(x_1 \vee x_2 \vee x_3) = \exists y\, \varphi(x, y)$, but only in the first case the connectivity is preserved.

An example of the difference between a faithful and an unfaithful expression is shown in
Figure 3.1.
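The three conditions of faithful expressibility can be checked mechanically on small examples. The sketch below is a brute-force illustration with hypothetical names; a formula is modelled as a Python predicate `phi(x, y)` rather than as a CNF(S)-formula. The first example is the expression $R_{NAE}(x_1, x_2, y_1) \wedge R_{NAE}(x_2, x_3, y_2) \wedge R_{NAE}(y_1, y_2, 1)$ for the clause $(x_1 \vee x_2 \vee x_3)$, where the argument $y_1$ of the last conjunct is an assumption of this sketch. The second is a deliberately simple toy (not taken from the paper) that expresses the full unary relation through $x = y$ and fails the common-witness condition.

```python
from itertools import product

def is_connected(points):
    """Connectivity of the hypercube subgraph induced by a set of bit tuples."""
    points = set(points)
    if not points:
        return True
    stack = [next(iter(points))]
    seen = set(stack)
    while stack:
        v = stack.pop()
        for i in range(len(v)):
            w = v[:i] + (1 - v[i],) + v[i + 1:]
            if w in points and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == points

def is_faithful(target, phi, k, m):
    """Check conditions 1-3 for `phi` over k main variables and m witness variables."""
    wit = {
        a: {y for y in product((0, 1), repeat=m) if phi(a, y)}
        for a in product((0, 1), repeat=k)
    }
    # Condition 1: the projection of phi onto x equals the target relation.
    if {a for a, ys in wit.items() if ys} != set(target):
        return False
    # Condition 2: every witness set is connected.
    if not all(is_connected(wit[a]) for a in target):
        return False
    # Condition 3: neighbours inside the target share a witness.
    for a in target:
        for i in range(k):
            b = a[:i] + (1 - a[i],) + a[i + 1:]
            if b in target and not (wit[a] & wit[b]):
                return False
    return True

def nae(a, b, c):
    return not (a == b == c)

OR3 = set(product((0, 1), repeat=3)) - {(0, 0, 0)}

def phi_nae(x, y):
    return nae(x[0], x[1], y[0]) and nae(x[1], x[2], y[1]) and nae(y[0], y[1], 1)

def phi_equal(x, y):
    # Toy example: expresses {0, 1} but neighbours 0 and 1 have no common witness.
    return x[0] == y[0]
```

Here `is_faithful(OR3, phi_nae, 3, 2)` holds, while `is_faithful({(0,), (1,)}, phi_equal, 1, 1)` fails on the third condition even though the projection is correct.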

Lemma 3.2. Let $S$ and $S'$ be sets of relations such that every $R \in S'$ is faithfully
expressible from $S$. Given a CNF($S'$)-formula $\psi(x)$, one can efficiently construct a CNF($S$)-formula
$\varphi(x, y)$ such that:

1. $\psi(x) \equiv \exists y\, \varphi(x, y)$;

2. if $(s, w^s), (t, w^t) \in \varphi$ are connected in $G(\varphi)$ by a path of length $d$, then there is a
path from $s$ to $t$ in $G(\psi)$ of length at most $d$;

3. if $s, t \in \psi$ are connected in $G(\psi)$, then for every witness $w^s$ of $s$, and every witness
$w^t$ of $t$, there is a path from $(s, w^s)$ to $(t, w^t)$ in $G(\varphi)$.

Proof. Suppose $\psi$ is a formula on $n$ variables that consists of $m$ clauses $C_1, \ldots, C_m$.
For clause $C_j$, assume that the set of variables is $V_j \subseteq [n]$, and that it involves relation
$R_j \in S'$. Thus, $\psi(x)$ is $\bigwedge_{j=1}^{m} R_j(x_{V_j})$. Let $\varphi_j$ be the faithful expression for $R_j$ from $S$, so
that $R_j(x_{V_j}) \equiv \exists y_j\, \varphi_j(x_{V_j}, y_j)$. Let $y$ be the vector $(y_1, \ldots, y_m)$ and let $\varphi(x, y)$ be the
formula $\bigwedge_{j=1}^{m} \varphi_j(x_{V_j}, y_j)$. Then $\psi(x) \equiv \exists y\, \varphi(x, y)$.

Statement (2) follows from (1) by projection of the path on the coordinates of $x$. For
statement (3), consider $s, t \in \psi$ that are connected in $G(\psi)$ via a path $s = u^0 \to u^1 \to
\cdots \to u^r = t$. For every $u^i, u^{i+1}$, and clause $C_j$, there exists an assignment $w^i_j$ to
$y_j$ such that both $(u^i_{V_j}, w^i_j)$ and $(u^{i+1}_{V_j}, w^i_j)$ are solutions of $\varphi_j$, by condition (3) of
faithful expressibility. Thus $(u^i, w^i)$ and $(u^{i+1}, w^i)$ are both solutions of $\varphi$, where $w^i =
(w^i_1, \ldots, w^i_m)$. Further, for every $u^i$, the space of solutions of $\varphi(u^i, y)$ is the product space
of the solutions of $\varphi_j(u^i_{V_j}, y_j)$ over $j = 1, \ldots, m$. Since these are all connected by condition
(2) of faithful expressibility, $G(\varphi(u^i, y))$ is connected. The following describes a path from
$(s, w^s)$ to $(t, w^t)$ in $G(\varphi)$: $(s, w^s) \leadsto (s, w^0) \to (u^1, w^0) \leadsto (u^1, w^1) \to \cdots \leadsto
(u^{r-1}, w^{r-1}) \to (t, w^{r-1}) \leadsto (t, w^t)$. Here $\leadsto$ indicates a path in $G(\varphi(u^i, y))$. □

Corollary 3.3. Suppose $S$ and $S'$ are sets of relations such that every $R \in S'$ is
faithfully expressible from $S$.

1. There are polynomial-time reductions from Conn($S'$) to Conn($S$), and from
st-Conn($S'$) to st-Conn($S$).

2. Given a CNF($S'$)-formula $\psi(x)$ with $m$ clauses, one can efficiently construct a
CNF($S$)-formula $\varphi(x, y)$ such that the length of $y$ is $O(m)$ and the diameter of the
solution space does not decrease.

3.2. The Faithful Expressibility Theorem. In this subsection, we prove the Faithful
Expressibility Theorem. The main step in the proof is Lemma 3.4, which shows that if $S$ is not
tight, then we can faithfully express the 3-clause relations from the relations in $S$. If $k \ge 2$,
then a $k$-clause is a disjunction of $k$ variables or negated variables. For $0 \le i \le k$, let $D_i$
be the set of all satisfying truth assignments of the $k$-clause whose first $i$ literals are negated,
and let $S_k = \{D_0, D_1, \ldots, D_k\}$. Thus, CNF($S_k$) is the collection of $k$-CNF formulas.

Lemma 3.4. If a set $S$ of relations is not tight, then $S_3$ is faithfully expressible from $S$.
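The clause relations just defined are easy to enumerate: the $k$-clause whose first $i$ literals are negated is falsified by exactly one assignment, namely $1^i 0^{k-i}$. A minimal sketch, with hypothetical function names:

```python
from itertools import product

def clause_relation(k, i):
    """D_i: satisfying assignments of the k-clause whose first i literals are negated."""
    falsifying = (1,) * i + (0,) * (k - i)
    return set(product((0, 1), repeat=k)) - {falsifying}

def clause_relations(k):
    """S_k = {D_0, ..., D_k}."""
    return [clause_relation(k, i) for i in range(k + 1)]
```

For $k = 3$ this reproduces the relations $R_0, \ldots, R_3$ used earlier to cast 3-Sat as a Sat($S$) problem, and for $k = 2$ the relations $D_0$ and $D_2$ are OR and NAND.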

Proof. First, observe that all 2-clauses are faithfully expressible from $S$. There exists
$R \in S$ which is not OR-free, so we can express $(x_1 \vee x_2)$ by substituting constants in $R$.
Similarly, we can express $(\bar{x}_1 \vee \bar{x}_2)$ using a relation that is not NAND-free. The last 2-clause
$(\bar{x}_1 \vee x_2)$ can be obtained from OR and NAND by a technique that corresponds to reverse
resolution: $(\bar{x}_1 \vee x_2) = \exists y\, (\bar{x}_1 \vee \bar{y}) \wedge (y \vee x_2)$. It is easy to see that this gives a faithful
expression. From here onwards we assume that $S$ contains all 2-clauses. The proof now
proceeds in four steps. First, we will express a relation in which there exist two elements that
are at graph distance larger than their Hamming distance. Second, we will express a relation
that is just a single path between such elements. Third, we will express a relation which is
a path of length 4 between elements at Hamming distance 2. Finally, we will express the
3-clauses.
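As a worked check of the faithfulness claim for reverse resolution, take the identity in the form $(\bar{x}_1 \vee x_2) = \exists y\, (\bar{x}_1 \vee \bar{y}) \wedge (y \vee x_2)$; the placement of the negations is an assumption of this sketch, and the mirrored form behaves the same way. Listing the witnesses $y$ for each assignment to $(x_1, x_2)$ gives:

```latex
\begin{array}{c|c}
(x_1, x_2) & \text{witnesses } y \\ \hline
00 & \{1\} \\
01 & \{0, 1\} \\
11 & \{0\} \\
10 & \emptyset
\end{array}
```

Exactly the three satisfying assignments of the clause have a witness, every witness set is connected, the neighbors 00 and 01 share the witness $y = 1$, and the neighbors 01 and 11 share the witness $y = 0$. So all three conditions of faithful expressibility hold.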

Step 1. Faithfully expressing a relation in which some distance expands. For a relation
$R$, we say that the distance between $a$ and $b$ expands if $a$ and $b$ are connected in $G(R)$, but
$d_R(a, b) > |a - b|$. Later on, we will show that no distance expands in componentwise
bijunctive relations. The same also holds true for the relation $R_{NAE} = \{0,1\}^3 \setminus \{000, 111\}$,
which is not componentwise bijunctive. Nonetheless, we show here that if $R$ is not
componentwise bijunctive, then, by adding 2-clauses, we can faithfully express a relation $Q$ in which
some distance expands. For instance, when $R = R_{NAE}$, then we can take $Q(x_1, x_2, x_3) =
R_{NAE}(x_1, x_2, x_3) \wedge (\bar{x}_1 \vee \bar{x}_3)$. The distance between $a = 100$ and $b = 001$ in $Q$ expands.
Similarly, in the general construction, we identify $a$ and $b$ on a cycle, and add 2-clauses that
eliminate all the vertices along the shorter arc between $a$ and $b$.
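The Not-All-Equal example can be verified by a direct shortest-path computation. The sketch below assumes the added 2-clause is $(\bar{x}_1 \vee \bar{x}_3)$, which is the clause that removes the midpoint 101 of the shorter arc between 100 and 001; the function names are hypothetical.

```python
from collections import deque

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def graph_distance(relation, a, b):
    """Shortest-path distance between a and b inside `relation`, or None if disconnected."""
    dist = {a: 0}
    queue = deque([a])
    while queue:
        v = queue.popleft()
        if v == b:
            return dist[v]
        for i in range(len(v)):
            w = v[:i] + (1 - v[i],) + v[i + 1:]
            if w in relation and w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return None

R_NAE = {(1, 0, 0), (1, 1, 0), (0, 1, 0), (0, 1, 1), (0, 0, 1), (1, 0, 1)}
# Q = R_NAE and (not x1 or not x3): drop every tuple with x1 = x3 = 1.
Q = {v for v in R_NAE if not (v[0] and v[2])}

a, b = (1, 0, 0), (0, 0, 1)
```

In $R_{NAE}$ the two assignments are at graph distance 2, equal to their Hamming distance. In $Q$ the only remaining route is the longer arc 100, 110, 010, 011, 001 of length 4, so the distance expands.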

Since $S$ is not tight, it contains a relation $R$ which is not componentwise bijunctive. If
$R$ contains $a, b$ where the distance between them expands, we are done. So assume that for
all $a, b \in G(R)$, $d_R(a, b) = |a - b|$. Since $R$ is not componentwise bijunctive, there exists
a triple of assignments $a, b, c$ lying in the same component such that $\mathrm{maj}(a, b, c)$ is not in
that component (which also easily implies it is not in $R$). Choose the triple such that the sum
of pairwise distances $d_R(a, b) + d_R(b, c) + d_R(c, a)$ is minimized. Let $U = \{i \mid a_i \neq b_i\}$,
$V = \{i \mid b_i \neq c_i\}$, and $W = \{i \mid c_i \neq a_i\}$. Since $d_R(a, b) = |a - b|$, a shortest path does not
flip variables outside of $U$, and each variable in $U$ is flipped exactly once. The same holds
for $V$ and $W$. We note some useful properties of the sets $U, V, W$.

Fig. 3.2. Proof of Step 1 of Lemma 3.4 and an example (panels: $R_{NAE}(x_1, x_2, x_3)$, and $R_{NAE}(x_1, x_2, x_3)$ with an added 2-clause).

1. Every index $i \in U \cup V \cup W$ occurs in exactly two of $U, V, W$.

Consider going by a shortest path from $a$ to $b$ to $c$ and back to $a$. Every $i \in U \cup
V \cup W$ is seen an even number of times along this path since we return to $a$. It is
seen at least once, and at most thrice, so in fact it occurs twice.

2. Every pairwise intersection $U \cap V$, $V \cap W$ and $W \cap U$ is non-empty.

Suppose the sets $U$ and $V$ are disjoint. From Property 1, we must have $W = U \cup V$.
But then it is easy to see that $\mathrm{maj}(a, b, c) = b$, which is in $R$. This contradicts the
choice of $a, b, c$.

3. The sets $U \cap V$ and $U \cap W$ partition the set $U$.

By Property 1, each index of $U$ occurs in one of $V$ and $W$ as well. Also, since no
index occurs in all three sets $U, V, W$, this is in fact a disjoint partition.

4. For each index $i \in U \cap W$, it holds that $a \oplus e_i \notin R$, where $e_i$ denotes the vector
with a 1 in position $i$ and 0 elsewhere.

Assume for the sake of contradiction that $a' = a \oplus e_i \in R$. Since $i \in U \cap W$
we have simultaneously moved closer to both $b$ and $c$. Hence we have $d_R(a', b) +
d_R(b, c) + d_R(c, a') < d_R(a, b) + d_R(b, c) + d_R(c, a)$. Also $\mathrm{maj}(a', b, c) =
\mathrm{maj}(a, b, c) \notin R$. But this contradicts our choice of $a, b, c$.

Property 4 implies that the shortest paths to $b$ and $c$ diverge at $a$, since for any shortest
path to $b$ the first variable flipped is from $U \cap V$ whereas for a shortest path to $c$ it is from
$W \cap V$. Similar statements hold for the vertices $b$ and $c$. Thus along the shortest path from
$a$ to $b$ the first bit flipped is from $U \cap V$ and the last bit flipped is from $U \cap W$. On the other
hand, if we go from $a$ to $c$ and then to $b$, all the bits from $U \cap W$ are flipped before the bits
from $U \cap V$. We use this crucially to define $Q$. We will add a set of 2-clauses that enforce the
following rule on paths starting at $a$: flip variables from $U \cap W$ before variables from $U \cap V$.
This will eliminate all shortest paths from $a$ to $b$ since they begin by flipping a variable in
$U \cap V$ and end with $U \cap W$. The paths from $a$ to $b$ via $c$ survive since they flip $U \cap W$ while
going from $a$ to $c$ and $U \cap V$ while going from $c$ to $b$. However, all remaining paths have
length at least $|a - b| + 2$ since they flip twice some variables not in $U$.

Take all pairs of indices S U {^W^j G U {^V}. The following conditions hold 

from the definition of C/, V, W: ai = Ci = hi and aj = Cj = bj. Add the 2-clause Cij assert- 
ing that the pair of variables XiXj must take values in {aiUj , CiCj , bfij } = {aiUj , aiUj , aiCij}. 
The new relation is Q = i? Aij Cij. Note that Q C R. We verify that the distance between a 
and b in Q expands. It is easy to see that for any j e U, the assignment a © ^ Q. Hence 
there are no shortest paths left from a to b. On the other hand, it is easy to see that a and b 
are still connected, since the vertex c is still reachable from both. 

Step 2. Isolating a pair of assignments whose distance expands. The relation Q
obtained in Step 1 may have several disconnected components. This cleanup step isolates a
single pair of assignments whose distance expands. By adding 2-clauses, we show that one
can express a path of length r + 2 between assignments at distance r.

Take $a, b \in Q$ whose distance expands in Q and for which $d_Q(a, b)$ is minimized. Let
$U = \{i \mid a_i \neq b_i\}$, and $|U| = r$. Shortest paths between a and b have certain useful properties:

1. Each shortest path flips every variable from U exactly once. 

Observe that each index $j \in U$ is flipped an odd number of times along any path
from a to b. Suppose it is flipped thrice along a shortest path. Starting at a and
going along this path, let b' be the assignment reached after flipping j twice. Then
the distance between a and b' expands, since j is flipped twice along a shortest path
between them in Q. Also $d_Q(a, b') < d_Q(a, b)$, contradicting the choice of a and
b.

2. Every shortest path flips exactly one variable $i \notin U$.

Since the distance between a and b expands, every shortest path must flip some
variable $i \notin U$. Suppose it flips more than one such variable. Since a and b agree
on these variables, each of them is flipped an even number of times. Let i be the
first variable to be flipped twice. Let b' be the assignment reached after flipping i
the second time. It is easy to verify that the distance between a and b' also expands,
but $d_Q(a, b') < d_Q(a, b)$.

3. The variable $i \notin U$ is the first and last variable to be flipped along the path.

Assume the first variable flipped is not i. Let a' be the assignment reached along the path
before we flip i the first time. Then $d_Q(a', b) < d_Q(a, b)$. The distance between a'
and b expands since the shortest path between them flips the variable i twice. This
contradicts the choice of a and b. Assume $j \in U$ is flipped twice. Then as before
we get a pair a', b' that contradict the choice of a, b.

Every shortest path between a and b has the following structure: first a variable $i \notin U$
is flipped to $\bar{a}_i$, then the variables from U are flipped in some order, finally the variable i is
flipped back to $a_i$.

Different shortest paths may vary in the choice of $i \notin U$ in the first step and in the
order in which the variables from U are flipped. Fix one such path $T \subseteq Q$. Assume that
$U = \{1, \ldots, r\}$ and the variables are flipped in this order, and the additional variable flipped
twice is r + 1. Denote the path by $a \to u^0 \to u^1 \to \cdots \to u^r \to b$. Next we prove that we
cannot flip the (r + 1)-th variable at an intermediate vertex along the path.

4. For $1 \le j \le r - 1$ the assignment $u^j \oplus e_{r+1} \notin Q$.

Suppose that for some j, we have $c = u^j \oplus e_{r+1} \in Q$. Then c differs from a on
$\{1, \ldots, j\}$ and from b on $\{j + 1, \ldots, r\}$. The distance from c to at least one of a or
b must expand, else we get a path from a to b through c of length $|a - b|$ which
contradicts the fact that this distance expands. However $d_Q(a, c)$ and $d_Q(b, c)$ are
strictly less than $d_Q(a, b)$ so we get a contradiction to the choice of a, b.

We now construct the path of length r + 2. For all $i \ge r + 2$ we set $x_i = a_i$ to get
a relation on r + 1 variables. Note that $b = \bar{a}_1 \ldots \bar{a}_r a_{r+1}$. Take $i < j \in U$. Along
the path T the variable i is flipped before j so the variables $x_i x_j$ take one of three values
$\{a_i a_j, \bar{a}_i a_j, \bar{a}_i \bar{a}_j\}$. So we add a 2-clause $C_{ij}$ that requires $x_i x_j$ to take one of these
values and take $T = Q \wedge \bigwedge_{i,j} C_{ij}$. Clearly, every assignment along the path lies in T. We claim
that these are the only solutions. To show this, take an arbitrary assignment c satisfying the
added constraints. If for some $i < j \le r$ we have $c_i = a_i$ but $c_j = \bar{a}_j$, this would violate
$C_{ij}$. Hence the first r variables of c are of the form $\bar{a}_1 \ldots \bar{a}_i a_{i+1} \ldots a_r$ for $0 \le i \le r$. If
$c_{r+1} = \bar{a}_{r+1}$ then $c = u^i$. If $c_{r+1} = a_{r+1}$ then $c = u^i \oplus e_{r+1}$. By property 4 above, such
a vector satisfies Q if and only if i = 0 or i = r, which correspond to c = a and c = b
respectively.

Step 3. Faithfully expressing paths of length 4. Let $\mathcal{P}$ denote the set of all ternary
relations whose graph is a path of length 4 between two assignments at Hamming distance 2.
Up to permutations of coordinates, there are 6 such relations. Each of them is the conjunction
of a 3-clause and a 2-clause. For instance, the relation M = {100, 110, 010, 011, 001} can be
written as $(x_1 \vee x_2 \vee x_3) \wedge (\bar{x}_1 \vee \bar{x}_3)$. (It is named so, because its graph looks like the
letter 'M' on the cube.) These relations are "minimal" examples of relations that are not
componentwise bijunctive. By projecting out intermediate variables from the path T obtained in
Step 2 we faithfully express one of the relations in $\mathcal{P}$. We faithfully express other relations
in $\mathcal{P}$ using this relation.

We will write all relations in $\mathcal{P}$ in terms of
$M(x_1, x_2, x_3) = (x_1 \vee x_2 \vee x_3) \wedge (\bar{x}_1 \vee \bar{x}_3)$, by negating variables. For example
$M(\bar{x}_1, x_2, x_3) = (\bar{x}_1 \vee x_2 \vee x_3) \wedge (x_1 \vee \bar{x}_3) =$
{000, 010, 110, 111, 101}.
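
As an informal sanity check (not part of the original argument), the following Python sketch enumerates the relation M from its two clauses and confirms that its solution graph is a path on five vertices, that is, a path of length 4. The helper names `solutions`, `is_path` and `M` are ours, and the variant with the first variable negated is our reading of the example above.

```python
from itertools import product

def solutions(pred, n):
    """All n-bit tuples satisfying the predicate `pred`."""
    return {bits for bits in product((0, 1), repeat=n) if pred(*bits)}

def is_path(vertices):
    """True if the hypercube subgraph induced by `vertices` is a simple path."""
    vs = list(vertices)
    adj = {v: [w for w in vs if sum(a != b for a, b in zip(v, w)) == 1] for v in vs}
    # A path has exactly two vertices of degree 1 and all others of degree 2 ...
    if sorted(len(nb) for nb in adj.values()) != [1, 1] + [2] * (len(vs) - 2):
        return False
    # ... and it is connected (checked by depth-first search).
    seen, stack = {vs[0]}, [vs[0]]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(vs)

def M(x1, x2, x3):
    """(x1 v x2 v x3) ^ (~x1 v ~x3)."""
    return bool((x1 or x2 or x3) and (not x1 or not x3))

M_set = solutions(M, 3)
# Same relation with the first variable negated.
M_neg_x1 = solutions(lambda x1, x2, x3: M(1 - x1, x2, x3), 3)
```

Both `M_set` and `M_neg_x1` have five tuples, and `is_path` holds for each.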

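Define the relation $P(x_1, x_{r+1}, x_2) = \exists x_3 \cdots x_r\, T(x_1, \ldots, x_{r+1})$. The table below
listing all tuples in P and their witnesses, shows that the conditions for faithful expressibility
are satisfied, and $P \in \mathcal{P}$.

Let $P(x_1, x_2, x_3) = M(l_1, l_2, l_3)$, where $l_i$ is one of $\{x_i, \bar{x}_i\}$. We can now use P and
2-clauses to express every other relation in $\mathcal{P}$. Given $M(l_1, l_2, l_3)$ every relation in $\mathcal{P}$ can
be obtained by negating some subset of the variables. Hence it suffices to show that we
can express faithfully $M(\bar{l}_1, l_2, l_3)$ and $M(l_1, \bar{l}_2, l_3)$ (M is symmetric in $x_1$ and $x_3$). In the
following let $\lambda$ denote one of the literals $\{y, \bar{y}\}$, such that it is $\bar{y}$ if and only if $l_1$ is $\bar{x}_1$.

$$
\begin{aligned}
M(\bar{l}_1, l_2, l_3) &= (\bar{l}_1 \vee l_2 \vee l_3) \wedge (l_1 \vee \bar{l}_3) \\
&= \exists y\; (\bar{l}_1 \vee \bar{\lambda}) \wedge (\lambda \vee l_2 \vee l_3) \wedge (l_1 \vee \bar{l}_3) \\
&= \exists y\; (\bar{l}_1 \vee \bar{\lambda}) \wedge (\lambda \vee l_2 \vee l_3) \wedge (l_1 \vee \bar{l}_3) \wedge (\bar{\lambda} \vee \bar{l}_3) \\
&= \exists y\; (\bar{l}_1 \vee \bar{\lambda}) \wedge (l_1 \vee \bar{l}_3) \wedge M(\lambda, l_2, l_3) \\
&= \exists y\; (\bar{l}_1 \vee \bar{\lambda}) \wedge (l_1 \vee \bar{l}_3) \wedge P(y, x_2, x_3)
\end{aligned}
$$

In the second step the clause $(\bar{\lambda} \vee \bar{l}_3)$ is implied by the resolution of the clauses
$(\bar{l}_1 \vee \bar{\lambda}) \wedge (l_1 \vee \bar{l}_3)$.

For the next expression let $\lambda$ denote one of the literals $\{y, \bar{y}\}$, such that it is negated if
and only if $l_2$ is $\bar{x}_2$.

$$
\begin{aligned}
M(l_1, \bar{l}_2, l_3) &= (l_1 \vee \bar{l}_2 \vee l_3) \wedge (\bar{l}_1 \vee \bar{l}_3) \\
&= \exists y\; (l_1 \vee l_3 \vee \lambda) \wedge (\bar{\lambda} \vee \bar{l}_2) \wedge (\bar{l}_1 \vee \bar{l}_3) \\
&= \exists y\; (\bar{\lambda} \vee \bar{l}_2) \wedge M(l_1, \lambda, l_3) \\
&= \exists y\; (\bar{\lambda} \vee \bar{l}_2) \wedge P(x_1, y, x_3)
\end{aligned}
$$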

The above expressions are both based on resolution and it is easy to check that they satisfy 
the properties of faithful expressibility. 

Step 4. Faithfully expressing $S_3$. We faithfully express $(x_1 \vee x_2 \vee x_3)$ from M using
a formula derived from a gadget in [11]. This gadget expresses $(x_1 \vee x_2 \vee x_3)$ in terms of
"Protected OR", which corresponds to our relation M.

$$
(x_1 \vee x_2 \vee x_3) = \exists y_1 \ldots y_5\; (x_1 \vee \bar{y}_1) \wedge (x_2 \vee \bar{y}_2) \wedge (x_3 \vee \bar{y}_3) \wedge (x_3 \vee \bar{y}_4)
\wedge M(y_1, y_5, y_3) \wedge M(y_2, \bar{y}_5, y_4) \qquad (3.1)
$$

The table below listing the witnesses of each assignment for $(x_1, x_2, x_3)$, shows that the
conditions for faithful expressibility are satisfied.

From the relation $(x_1 \vee x_2 \vee x_3)$ we derive the other 3-clauses by reverse resolution, for
instance

$$(\bar{x}_1 \vee x_2 \vee x_3) = \exists y\; (\bar{x}_1 \vee \bar{y}) \wedge (y \vee x_2 \vee x_3)$$

□ 
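
The projection claimed in Step 4 can be checked by brute force. The sketch below is only an illustration under our reading of equation (3.1), in which the 2-clauses are $(x_i \vee \bar{y}_i)$ and the second copy of M takes $\bar{y}_5$ as its middle argument; the function names are ours. It checks the projection onto $(x_1, x_2, x_3)$ only, not the connectivity conditions of faithful expressibility.

```python
from itertools import product

def M(x1, x2, x3):
    """(x1 v x2 v x3) ^ (~x1 v ~x3)."""
    return bool((x1 or x2 or x3) and (not x1 or not x3))

def gadget(x, y):
    """Right-hand side of (3.1) under the reading stated above (an assumption)."""
    x1, x2, x3 = x
    y1, y2, y3, y4, y5 = y
    return bool(
        (x1 or not y1) and (x2 or not y2) and (x3 or not y3) and (x3 or not y4)
        and M(y1, y5, y3) and M(y2, 1 - y5, y4)
    )

def witnesses(x):
    """All assignments to (y1, ..., y5) that extend x to a solution."""
    return [y for y in product((0, 1), repeat=5) if gadget(x, y)]

# Tuples (x1, x2, x3) that have at least one witness.
projection = {x for x in product((0, 1), repeat=3) if witnesses(x)}

def reverse_resolution(x):
    """exists y: (~x1 v ~y) ^ (y v x2 v x3)."""
    x1, x2, x3 = x
    return any((not x1 or not y) and (y or x2 or x3) for y in (0, 1))
```

Under these assumptions `projection` contains every tuple except 000, and `reverse_resolution` agrees with the clause $(\bar{x}_1 \vee x_2 \vee x_3)$ on all eight inputs.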

To complete the proof of the Faithful Expressibility Theorem, we show that an arbitrary
relation can be expressed faithfully from $S_3$.

Lemma 3.5. Let $R \subseteq \{0, 1\}^k$ be any relation of arity $k \ge 1$. R is faithfully expressible
from $S_3$.

Proof. If $k \le 3$ then R can be expressed as a formula in CNF($S_3$) with constants,
without introducing witness variables. This kind of expression is always faithful.
If $k \ge 4$ then R can be expressed as a formula in CNF($S_k$), without witnesses (i.e.
faithfully). We will show that every k-clause can be expressed faithfully from $S_{k-1}$. Then,
by induction, it can be expressed faithfully from $S_3$. For simplicity we express a k-clause
corresponding to the relation $D_0$. The remaining relations are expressed equivalently. We
express $D_0$ in a way that is standard in other complexity reductions, and turns out to be
faithful:

$$(x_1 \vee x_2 \vee \cdots \vee x_k) = \exists y\; (x_1 \vee x_2 \vee y) \wedge (\bar{y} \vee x_3 \vee \cdots \vee x_k).$$

This is the reverse operation of resolution. For any satisfying assignment for x, its witness
space is either {0}, {1} or {0, 1}, so in all cases it is connected. Furthermore, the only way
two neighboring satisfying assignments for x can have no common witness is if one of them
has witness set {0}, and the other one has witness set {1}. This implies that the first one
has $(x_3, \ldots, x_k) = (0, \ldots, 0)$, and the other one has $(x_1, x_2) = (0, 0)$, thus they differ in
the assignments of at least two variables: one from $\{x_1, x_2\}$ and one from $\{x_3, \ldots, x_k\}$.
In that case they cannot be neighboring assignments. Therefore all requirements of faithful
expressibility are satisfied. □
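
The two conditions used in this proof can be checked exhaustively for small k. The following Python sketch is an illustration only; it tests exactly the properties named above (the projection equals the k-clause, and neighboring satisfying assignments share a witness), and the function names are ours. Since the single witness variable y has a witness set inside {0, 1}, every non-empty witness set is automatically connected.

```python
from itertools import product

def witness_set(x):
    """Values of y satisfying (x1 v x2 v y) ^ (~y v x3 v ... v xk)."""
    return {y for y in (0, 1)
            if (x[0] or x[1] or y) and ((not y) or any(x[2:]))}

def check_reverse_resolution(k):
    """Check the projection and the common-witness condition for a k-clause."""
    sat = {}
    for x in product((0, 1), repeat=k):
        w = witness_set(x)
        # The projection must be exactly the clause x1 v ... v xk.
        if bool(w) != any(x):
            return False
        if w:
            sat[x] = w
    # Neighboring satisfying assignments must share a witness.
    for x, w in sat.items():
        for i in range(k):
            nb = x[:i] + (1 - x[i],) + x[i + 1:]
            if nb in sat and not (w & sat[nb]):
                return False
    return True
```

For example, `check_reverse_resolution(4)` and `check_reverse_resolution(5)` both return `True`.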

3.3. Hardness Results for 3-CNF formulas. From Lemma 3.4 and Corollary 3.3 it
follows that, to prove the hard side of our dichotomy theorems, it suffices to focus on
3-CNF formulas.

The proof that Conn($S_3$) and ST-Conn($S_3$) are PSPACE-complete is fairly intricate; it
entails a direct reduction from the computation of a space-bounded Turing machine. The
result for ST-Conn can also be proved easily using results of Hearn and Demaine on
Non-deterministic Constraint Logic [11]. However, it does not appear that completeness
for Conn follows from their results.

Lemma 3.6. ST-Conn($S_3$) and Conn($S_3$) are PSPACE-complete.

Proof. Given a CNF($S_3$) formula $\varphi$ and satisfying assignments s, t we can check if they
are connected in $G(\varphi)$ with polynomial amount of space. Similarly for Conn($S_3$), by reusing
space we can check for all pairs of assignments whether they are satisfying and, if they both
are, whether they are connected in $G(\varphi)$. It follows that both problems are in PSPACE.

Next we show that Conn($S_3$) and ST-Conn($S_3$) are PSPACE-hard. Consider the following
known PSPACE-complete problem: Given a deterministic Turing machine
$M = (Q, \Sigma, \Gamma, \delta, q_0, q_{accept}, q_{reject})$ and n in unary, will M accept the string consisting of
n blanks, without ever leaving its n tape squares? We give a polynomial time reduction from this
problem to ST-Conn($S_3$) and Conn($S_3$).

The reduction maps a machine M and integer n (without loss of generality, assuming
that n is at least as large as the description of M) to a 3-CNF formula $\varphi$ and two satisfying
assignments for the formula, which are connected in $G(\varphi)$ if and only if M accepts. Furthermore,
all satisfying assignments of $\varphi$ are connected to one of these two assignments, so that
$G(\varphi)$ is connected if and only if M accepts.

Before we show how to construct $\varphi$, we modify M in several ways:

1. We add a clock that counts from 0 to $n \times |Q| \times |\Gamma|^n = 2^{O(n)}$, which is the total
number of possible distinct configurations of M. It uses a separate tape of length
O(n) with the alphabet {0, 1}. Before a transition happens, control is passed on to
the clock, its counter is incremented, and finally the transition is completed.

2. We define a standard accepting configuration. Whenever $q_{accept}$ is reached, the clock
is stopped and set to zero, the original tape is erased and the head is placed in the
initial position, always in state $q_{accept}$.

3. Whenever $q_{reject}$ is reached the machine goes into its initial configuration. The tape
is erased, the clock is set to zero, the head is placed in the initial position, and the
state is set to $q_0$ (and thus the computation resumes).

4. Whenever the clock overflows, the machine goes into $q_{reject}$.

The new machine M' runs forever if M does not accept (rejects or loops), and accepts if
M accepts. It also has the property that every configuration leads either to the accepting
configuration or to the initial configuration. Therefore the space of configurations is connected if
and only if M accepts. Let us denote by Q' the states of M' and by $\delta'$ its transitions. M' runs
on two tapes, the main one of size N and the clock of size $N_c$, both O(n). The alphabet of
M' on one tape is $\Gamma$, and on the other {0, 1}. For simplicity we can also assume that at each
transition the machine uses only one of the two tapes.

Next, we construct an intermediate CNF-formula $\psi$ whose solutions are the configurations
of M'. However, the space of solutions of $\psi$ is disconnected.

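For each $i \in [N]$ and $a \in \Gamma$, we have a variable x(i, a). If x(i, a) = 1, this means that
the i-th tape cell contains symbol a. For every $i \in [N]$ there is a variable y(i) which is 1 if the
head is at position i. For every $q \in Q'$, there is a variable z(q) which is 1 if the current state
is q. Similarly for every $j \in [N_c]$ and $a \in \{0, 1\}$ we have variables $x_c(j, a)$ and a variable
$y_c(j)$ which is 1 if the head of the clock tape is at position j.

We enforce the following conditions:

1. Every cell contains some symbol:

$$\psi_1 = \bigwedge_{i \in [N]} \Big( \bigvee_{a \in \Gamma} x(i, a) \Big) \wedge \bigwedge_{j \in [N_c]} \Big( \bigvee_{a \in \{0,1\}} x_c(j, a) \Big).$$

2. No cell contains two symbols:

$$\psi_2 = \bigwedge_{i \in [N]} \bigwedge_{a \neq a' \in \Gamma} \big( \overline{x(i, a)} \vee \overline{x(i, a')} \big) \wedge \bigwedge_{j \in [N_c]} \big( \overline{x_c(j, 0)} \vee \overline{x_c(j, 1)} \big).$$

3. The head is in some position, the clock head is in some position, and the machine is
in some state:

$$\psi_3 = \Big( \bigvee_{i \in [N]} y(i) \Big) \wedge \Big( \bigvee_{j \in [N_c]} y_c(j) \Big) \wedge \Big( \bigvee_{q \in Q'} z(q) \Big).$$

4. The main tape head is in a unique position, the clock head is in a unique position,
and the machine is in a unique state:

$$\psi_4 = \bigwedge_{i \neq i' \in [N]} \big( \overline{y(i)} \vee \overline{y(i')} \big) \wedge \bigwedge_{j \neq j' \in [N_c]} \big( \overline{y_c(j)} \vee \overline{y_c(j')} \big) \wedge \bigwedge_{q \neq q' \in Q'} \big( \overline{z(q)} \vee \overline{z(q')} \big).$$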

Solutions of $\psi = \psi_1 \wedge \psi_2 \wedge \psi_3 \wedge \psi_4$ are in 1-1 correspondence with configurations of
M'. Furthermore, the assignments corresponding to any two distinct configurations differ in
at least two variables (hence the space of solutions is totally disconnected).
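
To make the encoding concrete, here is a small Python sketch that builds the four groups of clauses for toy parameters and counts the solutions by brute force. The parameter values (`N`, `NC`, `GAMMA`, `STATES`) and all function names are hypothetical choices made for this illustration; they do not come from the reduction itself, which uses tapes of length O(n).

```python
from itertools import product, combinations

N, NC = 2, 2              # toy lengths of the main tape and the clock tape
GAMMA = ("_", "a")        # toy tape alphabet
STATES = ("q0", "q1")     # toy state set Q'

# Each group lists variables of which exactly one must be 1.
groups = (
    [[("x", i, a) for a in GAMMA] for i in range(N)]          # symbol in cell i
    + [[("xc", j, b) for b in (0, 1)] for j in range(NC)]     # symbol in clock cell j
    + [[("y", i) for i in range(N)]]                          # main head position
    + [[("yc", j) for j in range(NC)]]                        # clock head position
    + [[("z", q) for q in STATES]]                            # machine state
)
variables = [v for g in groups for v in g]

def psi_clauses():
    """Clauses as lists of (variable, polarity) pairs."""
    clauses = []
    for g in groups:
        clauses.append([(v, True) for v in g])                # psi_1 and psi_3
        for u, v in combinations(g, 2):
            clauses.append([(u, False), (v, False)])          # psi_2 and psi_4
    return clauses

def solutions():
    clauses = psi_clauses()
    sols = []
    for bits in product((0, 1), repeat=len(variables)):
        val = dict(zip(variables, bits))
        if all(any(val[v] == pol for v, pol in c) for c in clauses):
            sols.append(bits)
    return sols

def min_pairwise_distance(sols):
    return min(sum(a != b for a, b in zip(u, v)) for u, v in combinations(sols, 2))
```

With these toy parameters the number of configurations is $|\Gamma|^N \cdot 2^{N_c} \cdot N \cdot N_c \cdot |Q'| = 128$, which matches `len(solutions())`, and no two solutions are at Hamming distance 1.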

Next, to connect the solution space along valid transitions of M', we relax conditions 2
and 4 by introducing new transition variables, which allow the head to have two states or a
cell to have two symbols at the same time. This allows us to go from one configuration to the
next.

Consider a transition $\delta(q, a) = (q', b, R)$, which operates on the first tape, for example.
Fix the position of the head of the first tape to be i, and the symbol in position i + 1 to be
c. The variables that are changed by the transition are: x(i, a), y(i), z(q), x(i, b), y(i + 1),
z(q'). Before the transition the first three are set to 1, the second three are set to 0, and after
the transition they are all flipped. Corresponding to this transition (which is specified by i, q,
a, and c) we introduce a transition variable t(i, q, a, c). We now relax conditions 2 and 4 as
follows:

• Replace $\big( \overline{x(i, a)} \vee \overline{x(i, b)} \big)$ by $\big( \overline{x(i, a)} \vee \overline{x(i, b)} \vee t(i, q, a, c) \big)$.

• Replace $\big( \overline{y(i)} \vee \overline{y(i + 1)} \big)$ by $\big( \overline{y(i)} \vee \overline{y(i + 1)} \vee t(i, q, a, c) \big)$.

• Replace $\big( \overline{z(q)} \vee \overline{z(q')} \big)$ by $\big( \overline{z(q)} \vee \overline{z(q')} \vee t(i, q, a, c) \big)$.

This is done for every value of q, a, i and c (and also for transitions acting on the clock
tape). We add the transition variables to the corresponding clauses so that for example the
clause $\big( \overline{x(i, a)} \vee \overline{x(i, b)} \big)$ could potentially become very long, such as:

$$\big( \overline{x(i, a)} \vee \overline{x(i, b)} \vee t(i, q_1, a, c_1) \vee t(i, q_2, a, c_2) \vee \cdots \big).$$

However, the total number of transition variables is only polynomial in n. We also add a
constraint for every pair of transition variables t(i, q, a, c), t(i', q', a', c'), saying they cannot
be 1 simultaneously: $\big( \overline{t(i, q, a, c)} \vee \overline{t(i', q', a', c')} \big)$. This ensures that only one transition can
be happening at any time. The effect of adding the transition variables to the clauses of $\psi_2$
and $\psi_4$ is that by setting t(i, q, a, c) to 1, we can simultaneously set x(i, a) and x(i, b) to
1, and so on. This gives a path from the configuration before the transition to the configuration
after it as follows: Set t(i, q, a, c) = 1, set x(i, b) = 1, y(i + 1) = 1, z(q') = 1, x(i, a) = 0, y(i) = 0,
z(q) = 0, then set t(i, q, a, c) = 0. Thus consecutive configurations are now connected. To
avoid connecting to other configurations, we also add an expression to ensure that these are
the only assignments the 6 variables can take when t(i, q, a, c) = 1:



$$\psi_{i,q,a,c} = \overline{t(i, q, a, c)} \vee \Big( \big( x(i, a), y(i), z(q), x(i, b), y(i + 1), z(q') \big) \in
\{111000, 111100, 111110, 111111, 011111, 001111, 000111\} \Big).$$

This expression can of course be written in conjunctive normal form.

Call the resulting CNF formula $\varphi(x, x_c, y, y_c, z, t)$. Note that $\varphi(x, x_c, y, y_c, z, 0) =
\psi(x, x_c, y, y_c, z)$, so a solution where all transition variables are 0 corresponds to a
configuration of M'. To see that we have not introduced any shortcut between configurations that are
not valid machine transitions, notice that in any solution of $\varphi$, at most a single transition
variable can be 1. Therefore none of the transitional solutions belonging to different transitions
can be adjacent. Furthermore, out of the solutions that have a transition variable set to 1, only
the first and the last correspond to a valid configuration. Therefore none of the intermediate
solutions can be adjacent to a solution with all transition variables set to 0.

The formula $\varphi$ is a CNF formula where clause size is unbounded. We use the same
reduction as in the proof of Lemma 3.5 to get a 3-CNF formula. By Lemma 3.2 and Corollary
3.3, ST-Conn and Conn for $S_3$ are PSPACE-complete. □

By Lemma 3.4 and Corollary 3.3 this completes the proof of the hardness part of the
dichotomies for Conn and ST-Conn (Theorems 2.8 and 2.9).

Finally, we show that 3-CNF formulas can have exponential diameter, by inductively
constructing a path of length at least $2^{n/2}$ on n variables and then identifying it with the solution
space of a 3-CNF formula with $O(n^2)$ clauses. By Lemma 3.4 and Corollary 3.3 this implies
the hardness part of the diameter dichotomy (Theorem 2.10).

Lemma 3.7. For n even, there is a 3-CNF formula $\varphi_n$ with n variables and $O(n^2)$
clauses, such that $G(\varphi_n)$ is a path of length greater than $2^{n/2}$.

Proof. The construction is in two steps: we first exhibit an induced subgraph $G_n$ of the
n-dimensional hypercube with large diameter. We then construct a 3-CNF formula $\varphi_n$ so that
$G_n = G(\varphi_n)$.

The graph $G_n$ is a path of length at least $2^{n/2}$. We construct it using induction. For n = 2, we
take $V(G_2) = \{(0, 0), (0, 1), (1, 1)\}$ which has diameter 2. Assume that we have constructed
$G_{n-2}$ with at least $2^{(n-2)/2}$ vertices, and with distinguished vertices $s_{n-2}$, $t_{n-2}$ such that the
shortest path from $s_{n-2}$ to $t_{n-2}$ in $G_{n-2}$ has length at least $2^{(n-2)/2}$. We now describe the set
$V(G_n)$. For each vertex $v \in V(G_{n-2})$, $V(G_n)$ contains two vertices (v, 0, 0) and (v, 1, 1). Note
that the subgraph induced by these vertices alone consists of two disconnected copies of
$G_{n-2}$. To connect these two components, we add the vertex $m = (t_{n-2}, 0, 1)$ (which is connected
to $(t_{n-2}, 0, 0)$ and $(t_{n-2}, 1, 1)$ in the induced subgraph). Note that the resulting graph $G_n$ is
connected, but any path from (u, 0, 0) to (v, 1, 1) must pass through m. Further note that by
induction, the graph $G_n$ is also a path. The vertices $s_n = (s_{n-2}, 0, 0)$ and $t_n = (s_{n-2}, 1, 1)$ are
diametrically opposite ends of this path. The path length is at least $2 \cdot 2^{(n-2)/2} + 2 > 2^{n/2}$.
Also $s_2 = (0, 0)$, $s_n = (s_{n-2}, 0, 0)$, $t_n = (s_{n-2}, 1, 1)$ and hence $s_n = (0, \ldots, 0)$,
$t_n = (0, \ldots, 0, 1, 1)$.

We construct a sequence of 3-CNF formulas $\varphi_n(x_1, \ldots, x_n)$ so that $G_n = G(\varphi_n)$. Let
$\varphi_2(x_1, x_2) = \bar{x}_1 \vee x_2$. Assume we have $\varphi_{n-2}(x_1, \ldots, x_{n-2})$. We add two variables $x_{n-1}$
and $x_n$ and the clauses

$$\varphi_{n-2}(x_1, \ldots, x_{n-2}), \qquad \bar{x}_{n-1} \vee x_n,$$

$$x_{n-1} \vee \bar{x}_n \vee \bar{x}_i \quad \text{for } i \le n - 4 \qquad (3.2)$$

$$x_{n-1} \vee \bar{x}_n \vee x_i \quad \text{for } i = n - 3, n - 2 \qquad (3.3)$$

Note that a clause in (3.2) is just the implication $(\bar{x}_{n-1} \wedge x_n) \to \bar{x}_i$. Thus clauses (3.2),
(3.3) enforce the condition that $x_{n-1} = 0$, $x_n = 1$ implies that $(x_1, \ldots, x_{n-2}) = t_{n-2} =
(0, \ldots, 0, 1, 1)$. □
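
The construction can be checked by brute force for small n. The sketch below is illustrative only: it builds the clauses under our reading of (3.2) and (3.3), with $\varphi_2 = \bar{x}_1 \vee x_2$ and the added 2-clause $\bar{x}_{n-1} \vee x_n$, and the function names are ours. Unrolling the recursion (path length $2L + 2$ from length $L$, starting at 2) gives a path with $2^{n/2+1} - 2$ edges, which is what the test below compares against.

```python
from itertools import product

def phi_clauses(n):
    """Clauses of phi_n as lists of (1-based variable index, polarity)."""
    assert n >= 2 and n % 2 == 0
    clauses = [[(1, False), (2, True)]]                   # phi_2 = ~x1 v x2
    for m in range(4, n + 1, 2):
        clauses.append([(m - 1, False), (m, True)])       # ~x_{m-1} v x_m
        for i in range(1, m - 3):                         # i <= m - 4, clauses (3.2)
            clauses.append([(m - 1, True), (m, False), (i, False)])
        for i in (m - 3, m - 2):                          # clauses (3.3)
            clauses.append([(m - 1, True), (m, False), (i, True)])
    return clauses

def solutions_of(clauses, n):
    return [bits for bits in product((0, 1), repeat=n)
            if all(any(bits[i - 1] == int(pol) for i, pol in c) for c in clauses)]

def hamming(u, v):
    return sum(a != b for a, b in zip(u, v))

def path_endpoints(vertices):
    """Return the two endpoints if the induced subgraph is a simple path, else None."""
    adj = {v: [w for w in vertices if hamming(v, w) == 1] for v in vertices}
    ends = [v for v in vertices if len(adj[v]) == 1]
    if len(ends) != 2 or any(len(nb) > 2 for nb in adj.values()):
        return None
    # Walk from one end to the other; a path visits every vertex exactly once.
    prev, cur, seen = None, ends[0], 1
    while cur != ends[1]:
        nxt = [w for w in adj[cur] if w != prev]
        prev, cur = cur, nxt[0]
        seen += 1
    return tuple(ends) if seen == len(vertices) else None
```

For n = 4, for instance, the solution graph has 7 vertices and its endpoints are 0000 and 0011.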

4. The Easy Case of the Dichotomy: Tight Sets of Relations. 

4.1. Schaefer sets of relations. We begin by showing that all Schaefer sets of relations
are tight. Schaefer relations are characterized by closure properties. We say that an
r-ary relation R is closed under some k-ary operation $\alpha : \{0, 1\}^k \to \{0, 1\}$ if for every
$a^1, a^2, \ldots, a^k \in R$, the tuple $(\alpha(a^1_1, a^2_1, \ldots, a^k_1), \ldots, \alpha(a^1_r, \ldots, a^k_r))$ is in R. We denote this
tuple by $\alpha(a^1, \ldots, a^k)$.

We will use the following lemma about closure properties on several occasions.

Lemma 4.1. If a logical relation R is closed under an operation $\alpha : \{0, 1\}^k \to \{0, 1\}$
such that $\alpha(1, 1, \ldots, 1) = 1$ and $\alpha(0, 0, \ldots, 0) = 0$ (a.k.a. an idempotent operation) then
every connected component of G(R) is closed under $\alpha$.

Proof. Consider $a^1, \ldots, a^k \in R$, such that they all belong to the same connected
component of G(R). It suffices to prove that $a = \alpha(a^1, \ldots, a^k)$ is in the same connected
component of G(R). To that end we will first prove that for any $s, t \in R$ if there
is a path from s to t in G(R) then there is a path from $\alpha(b^1, \ldots, b^{i-1}, s, b^{i+1}, \ldots, b^k)$
to $\alpha(b^1, \ldots, b^{i-1}, t, b^{i+1}, \ldots, b^k)$ for any $b^1, \ldots, b^k \in R$. This observation implies
that there is a path from $a^1 = \alpha(a^1, a^1, \ldots, a^1)$ to $\alpha(a^1, a^2, a^1, \ldots, a^1)$, from there to
$\alpha(a^1, a^2, a^3, a^1, \ldots, a^1)$ and so on, to $\alpha(a^1, a^2, \ldots, a^k) = a$. Thus a is in the same
connected component of G(R) as $a^1$.

Let the path from s to t be $s = s^1 \to s^2 \to \cdots \to s^m = t$. For every $j \in \{1, 2, \ldots, m - 1\}$,
the tuples $\alpha(b^1, \ldots, b^{i-1}, s^j, b^{i+1}, \ldots, b^k)$ and $\alpha(b^1, \ldots, b^{i-1}, s^{j+1}, b^{i+1}, \ldots, b^k)$
differ in at most one position (the position in which $s^j$ and $s^{j+1}$ are different) therefore
they belong to the same component of G(R). Thus $\alpha(b^1, \ldots, b^{i-1}, s^1, b^{i+1}, \ldots, b^k)$ and
$\alpha(b^1, \ldots, b^{i-1}, s^m, b^{i+1}, \ldots, b^k)$ belong to the same component. □

We are ready to prove that all Schaefer relations are tight. 

Lemma 4.2. Let R be a logical relation.
1. If R is bijunctive, then R is componentwise bijunctive.
2. If R is Horn, then R is OR-free.
3. If R is dual Horn, then R is NAND-free.
4. If R is affine, then R is componentwise bijunctive, OR-free, and NAND-free.

Proof. The case of bijunctive relations follows immediately from Lemma 4.1 and the
fact that a relation is bijunctive if and only if it is closed under the ternary majority operation
maj, which is idempotent.

The cases of Horn and dual Horn are symmetric. Suppose an r-ary Horn relation R is not
OR-free. Then there exist $i, j \in \{1, \ldots, r\}$ and constants $t_1, \ldots, t_r \in \{0, 1\}$ such that the
relation $R(t_1, \ldots, t_{i-1}, x, t_{i+1}, \ldots, t_{j-1}, y, t_{j+1}, \ldots, t_r)$ on variables x and y is equivalent
to $x \vee y$, i.e.

$$R(t_1, \ldots, t_{i-1}, x, t_{i+1}, \ldots, t_{j-1}, y, t_{j+1}, \ldots, t_r) = \{01, 11, 10\}.$$

Thus the tuples $t^{00}, t^{01}, t^{10}, t^{11}$ defined by $(t^{ab}_i, t^{ab}_j) = (a, b)$ and $t^{ab}_k = t_k$ for every
$k \notin \{i, j\}$, where $a, b \in \{0, 1\}$, satisfy $t^{10}, t^{11}, t^{01} \in R$ and $t^{00} \notin R$. However, since every
Horn relation is closed under $\wedge$, it follows that $t^{01} \wedge t^{10} = t^{00}$ must be in R, which is a
contradiction.

For the affine case, a small modification of the last step of the above argument shows that
an affine relation also is OR-free; therefore, dually, it is also NAND-free. Namely, since a
relation R is affine if and only if it is closed under ternary $\oplus$, it follows that
$t^{01} \oplus t^{11} \oplus t^{10} = t^{00}$ must be in R.

Since the connected components of an affine relation are both OR-free and NAND-free
the subgraphs that they induce are hypercubes, which are also bijunctive relations. Therefore
an affine relation is also componentwise bijunctive. □

These containments are proper. For instance, $R_{1/3}$ = {100, 010, 001} is componentwise
bijunctive, but not bijunctive as maj(100, 010, 001) = 000 $\notin R_{1/3}$.
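
The example can be reproduced mechanically, using the characterization stated in the proof above that a relation is bijunctive if and only if it is closed under maj. The following Python sketch is an illustration with helper names of our own choosing.

```python
def maj(a, b, c):
    """Coordinate-wise majority of three Boolean tuples."""
    return tuple(int(x + y + z >= 2) for x, y, z in zip(a, b, c))

def closed_under_maj(rel):
    return all(maj(a, b, c) in rel for a in rel for b in rel for c in rel)

def components(rel):
    """Connected components of the subgraph of the hypercube induced by rel."""
    rest, comps = set(rel), []
    while rest:
        start = rest.pop()
        comp, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for i in range(len(u)):
                v = u[:i] + (1 - u[i],) + u[i + 1:]
                if v in rest:
                    rest.remove(v)
                    comp.add(v)
                    stack.append(v)
        comps.append(comp)
    return comps

def componentwise_bijunctive(rel):
    return all(closed_under_maj(c) for c in components(rel))

R13 = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}
M_REL = {(1, 0, 0), (1, 1, 0), (0, 1, 0), (0, 1, 1), (0, 0, 1)}
```

Here `R13` splits into three singleton components, each trivially closed under maj, while the connected relation `M_REL` from Step 3 of the previous section is not closed under maj and hence is not componentwise bijunctive.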

4.2. Structural properties of tight sets of relations. In this section, we explore some
structural properties of the solution graphs of tight sets of relations. These properties provide
simple algorithms for Conn(S) and ST-Conn(S) for tight sets S, and also guarantee that
for such sets, the diameter of $G(\varphi)$ of a CNF(S)-formula $\varphi$ is linear.

Lemma 4.3. Let S be a set of componentwise bijunctive relations and $\varphi$ a CNF(S)-formula.
If a and b are two solutions of $\varphi$ that lie in the same component of $G(\varphi)$, then
$d_\varphi(a, b) = |a - b|$.

Proof. Consider first the special case in which every relation in S is bijunctive. In this
case, $\varphi$ is equivalent to a 2-CNF formula and so the space of solutions of $\varphi$ is closed under
majority. We show that there is a path in $G(\varphi)$ from a to b, such that along the path only the
assignments on variables with indices from the set $D = \{i \mid a_i \neq b_i\}$ change. This implies
that the shortest path is of length |D| by induction on |D|. Consider any path
$a \to u^1 \to \cdots \to u^r \to b$ in $G(\varphi)$. We construct another path by replacing $u^i$ by
$v^i = \mathrm{maj}(a, u^i, b)$ for $i = 1, \ldots, r$, and removing repetitions. This is a path because for any i,
$v^i$ and $v^{i+1}$ differ in at most one variable. Furthermore, $v^i$ agrees with a and b on every
coordinate j for which $a_j = b_j$. Therefore, along this path only variables in D are flipped.

For the general case, we show that every component F of $G(\varphi)$ is the solution space of
a 2-CNF formula $\varphi'$. Let F be the component of $G(\varphi)$ which contains a and b. Let $R \in S$
be a relation with two components, $R_1$, $R_2$, each of which is bijunctive. Consider a clause
in $\varphi$ of the form $R(x_1, \ldots, x_k)$. The projection of F onto $x_1, \ldots, x_k$ is itself connected and
must satisfy R. Hence it lies within one of the two components $R_1$, $R_2$; assume it is $R_1$. We
replace $R(x_1, \ldots, x_k)$ by $R_1(x_1, \ldots, x_k)$. Call this new formula $\varphi_1$. $G(\varphi_1)$ consists of all
components of $G(\varphi)$ whose projection on $x_1, \ldots, x_k$ lies in $R_1$. We repeat this for every
clause. Finally we are left with a formula $\varphi'$ over a set of bijunctive relations. Hence $\varphi'$ is
bijunctive and $G(\varphi')$ is a component of $G(\varphi)$. So the claim follows from the bijunctive case.
□

Corollary 4.4. Let S be a set of componentwise bijunctive relations. Then

1. For every $\varphi \in$ CNF(S) with n variables, the diameter of each component of $G(\varphi)$
is bounded by n.

2. ST-Conn(S) is in P.

3. Conn(S) is in coNP.

Proof. The bound on diameter is an immediate consequence of Lemma 4.3.
The following algorithm solves ST-Conn(S) given vertices $s, t \in G(\varphi)$. Start with
u = s. At each step, find a variable $x_i$ so that $u_i \neq t_i$ and flipping it yields another
solution, and flip it, until we reach t. If at any stage no such variable exists, then declare
that s and t are not connected. If s and t are disconnected, the algorithm is bound to fail. So
assume that they are connected. Correctness is proved by induction on d = |s − t|. It is clear
that the algorithm works when d = 1. Assume that the algorithm works for d − 1. If s and t are
connected and are distance d apart, Lemma 4.3 implies there is a path of length d between them
in $G(\varphi)$. In particular, the algorithm will find a variable $x_i$ to flip. The resulting assignment
is at distance d − 1 from t, so now we proceed by induction.

Next we prove that Conn(S) $\in$ coNP. A short certificate that the graph is not connected
is a pair of assignments s and t which are solutions from different components. To verify that
they are disconnected it suffices to run the algorithm for ST-Conn. □
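
The greedy procedure from this proof can be sketched in a few lines of Python. This is only an illustration: the formula is represented by a hypothetical membership oracle `is_solution`, and the answer is meaningful only when the underlying relations are componentwise bijunctive, as the corollary assumes.

```python
def st_connected(is_solution, s, t):
    """Greedy ST-Conn test: only flip variables on which u and t differ."""
    u = list(s)
    while tuple(u) != tuple(t):
        for i in range(len(u)):
            if u[i] != t[i]:
                u[i] ^= 1                      # tentatively flip x_i
                if is_solution(tuple(u)):
                    break                      # keep the flip, one step closer to t
                u[i] ^= 1                      # undo the flip
        else:
            return False                       # no admissible flip exists
    return True

def chain(x):
    """2-CNF (~x1 v x2) ^ (~x2 v x3); solutions 000, 001, 011, 111."""
    return bool((not x[0] or x[1]) and (not x[1] or x[2]))

def equal(x):
    """2-CNF (x1 v ~x2) ^ (~x1 v x2); solutions 00 and 11."""
    return x[0] == x[1]
```

Each successful flip reduces the Hamming distance to t by one, so the loop runs at most n times. On the examples, `st_connected(chain, (0, 0, 0), (1, 1, 1))` is `True` and `st_connected(equal, (0, 0), (1, 1))` is `False`.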

We consider sets of OR-free relations. Define the coordinate-wise partial order $\le$ on
Boolean vectors as follows: $a \le b$ if $a_i \le b_i$, for each i.

Lemma 4.5. Let S be a set of OR-free relations and $\varphi$ a CNF(S)-formula. Every
component of $G(\varphi)$ contains a minimum solution with respect to the coordinate-wise order;
moreover, every solution is connected to the minimum solution in the same component via a
monotone path.

Proof. We call a satisfying assignment locally minimal, if it has no neighboring satisfying
assignments that are smaller than it. We will show that there is exactly one such assignment
in each component of $G(\varphi)$.

Suppose there are two distinct locally minimal assignments u and u' in some component
of $G(\varphi)$. Consider the path between them where the maximum Hamming weight of
assignments on the path is minimized. If there are many such paths, pick one where the
smallest number of assignments have the maximum Hamming weight. Denote this path by
$u = u^1 \to u^2 \to \cdots \to u^r = u'$. Let $u^i$ be an assignment of largest Hamming weight in the
path. Then $u^i \neq u$ and $u^i \neq u'$, since u and u' are locally minimal. The assignments $u^{i-1}$
and $u^{i+1}$ differ in exactly 2 variables, say, in $x_1$ and $x_2$. So
$\{u^{i-1}_1 u^{i-1}_2,\ u^i_1 u^i_2,\ u^{i+1}_1 u^{i+1}_2\} = \{01, 11, 10\}$. Let $\hat{u}$ be such that
$\hat{u}_1 = \hat{u}_2 = 0$, and $\hat{u}_j = u^i_j$ for j > 2. If $\hat{u}$ is a solution, then
the path $u^1 \to \cdots \to u^{i-1} \to \hat{u} \to u^{i+1} \to \cdots \to u^r$ contradicts the way we chose the
original path. Therefore, $\hat{u}$ is not a solution. This means that there is a clause that is violated
by it, but is satisfied by $u^{i-1}$, $u^i$, and $u^{i+1}$. So the relation corresponding to that clause is
not OR-free, which is a contradiction.

The unique locally minimal solution in a component is its minimum solution, because 
starting from any other assignment in the component, it is possible to keep moving to neigh- 
bors that are smaller, and the only time it becomes impossible to find such a neighbor is 
when the locally minimal solution is reached. Therefore, there is a monotone path from any 
satisfying assignment to the minimum in that component. □ 

Corollary 4.6. Let S be a set of OR-free relations. Then

1. For every $\varphi \in$ CNF(S) with n variables, the diameter of each component of $G(\varphi)$
is bounded by 2n.

2. ST-Conn(S) is in P.

3. Conn(S) is in coNP.

Proof. Given solutions s and t in the same component of $G(\varphi)$, there is a monotone path
from each to the minimal solution u in the component. This gives a path from s to t of length
at most 2n. To check if s and t are connected, we just check that the minimal assignments
reached from s and t are the same. □
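
The test described in this proof can be sketched as follows. Again this is an illustration rather than a definitive implementation: `is_solution` is a hypothetical membership oracle for the formula, and the comparison of minima is justified only for sets of OR-free relations (Lemma 4.5).

```python
def descend_to_minimum(is_solution, s):
    """Follow a monotone path downwards until a locally minimal solution is reached."""
    u = list(s)
    moved = True
    while moved:
        moved = False
        for i in range(len(u)):
            if u[i] == 1:
                u[i] = 0                       # try to move to a smaller neighbor
                if is_solution(tuple(u)):
                    moved = True
                    break
                u[i] = 1                       # not a solution, undo
    return tuple(u)

def st_connected_or_free(is_solution, s, t):
    """s and t are connected iff they descend to the same minimum."""
    return descend_to_minimum(is_solution, s) == descend_to_minimum(is_solution, t)

def horn_example(x):
    """Horn formula (~x1 v x2) ^ (~x2 v x1) on three variables; x3 is free."""
    return x[0] == x[1]
```

In the example the components are {000, 001} and {110, 111}, with minima 000 and 110 respectively, so 001 and 111 are reported as disconnected.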

Sets of NAND-free relations are handled dually to OR-free relations. In this case there is
a maximum solution in every connected component of $G(\varphi)$ and every solution is connected
to it via a monotone path. Finally, putting everything together, we complete the proofs of all
our dichotomy theorems.

Corollary 4.7. Let S be a tight set of relations. Then

1. For every $\varphi \in$ CNF(S) with n variables, the diameter of each component of $G(\varphi)$
is bounded by 2n.

2. ST-Conn(S) is in P.

3. Conn(S) is in coNP.

4.3. The Complexity of Conn for Tight Sets of Relations. We pinpoint the complex- 
ity of Conn(iS) for the tight cases which are not Schaefer, using a result of Juban 1 12|. 

Lemma 4.8. For S tight, but not Schaefer, Conn(iS) is coNP-complete. 

Proof. The problem Another-Sat(S) is: given a formula $\varphi$ in CNF(S) and a solution
s, does there exist a solution $t \neq s$? Juban ([12], Theorem 2) shows that if S is not Schaefer,
then Another-Sat(S) is NP-complete. He also shows ([12], Corollary 1) that if S is not
Schaefer, then the relation $x \neq y$ is expressible from S through substitutions.

Since S is not Schaefer, Another-Sat(S) is NP-complete. Let $\varphi$, s be an instance of
Another-Sat(S) on variables $x_1, \dots, x_n$. We define a CNF(S) formula $\psi$ on the variables
$x_1, \dots, x_n, y_1, \dots, y_n$ as

$$\psi(x_1, \dots, x_n, y_1, \dots, y_n) = \varphi(x_1, \dots, x_n) \wedge \bigwedge_{i} (x_i \neq y_i).$$

Every solution of $\varphi$ extends to exactly one solution of $\psi$, and two distinct solutions of
$\psi$ differ in at least two coordinates. Hence $G(\psi)$ is connected if and only if s is the unique
solution to $\varphi$. □
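
The reduction can be checked by brute force on small instances. The following Python sketch is illustrative only: formulas are modelled as Python predicates on 0/1 tuples rather than as CNF(S) formulas, and the helper names `solution_graph_connected` and `reduce_to_conn` are hypothetical.

```python
from itertools import product


def solution_graph_connected(n, pred):
    """Brute force: is the subgraph of the n-cube induced by the solutions of pred connected?"""
    sols = {a for a in product((0, 1), repeat=n) if pred(a)}
    if not sols:
        return True  # convention chosen for this sketch: the empty graph counts as connected
    start = next(iter(sols))
    seen, stack = {start}, [start]
    while stack:
        a = stack.pop()
        for i in range(n):
            b = a[:i] + (1 - a[i],) + a[i + 1:]  # neighbour at Hamming distance 1
            if b in sols and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen == sols


def reduce_to_conn(n, phi):
    """Return (2n, psi) where psi(x, y) holds iff phi(x) and x_i != y_i for every i."""
    def psi(a):
        x, y = a[:n], a[n:]
        return bool(phi(x)) and all(x[i] != y[i] for i in range(n))
    return 2 * n, psi


# phi with a unique solution: the graph of psi is a single vertex, hence connected.
m, psi = reduce_to_conn(2, lambda x: x == (1, 0))
print(solution_graph_connected(m, psi))

# phi with three solutions: the solutions of psi are pairwise at distance >= 2.
m, psi = reduce_to_conn(2, lambda x: x[0] == 1 or x[1] == 1)
print(solution_graph_connected(m, psi))
```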

We are left with the task of determining the complexity of Conn(S) for the case when S
is a Schaefer set of relations. In Lemmas 4.9 and 4.10 we show that Conn(S) is in P if S is
affine or bijunctive. This leaves the case of Horn and dual Horn, which we discuss at the end
of this section.

Lemma 4.9. If S is a bijunctive set of relations then there is a polynomial time algorithm
for Conn(S).

Proof. Consider a formula $\varphi(x_1, \dots, x_n)$ in CNF(S). Since S is a bijunctive set of
relations, $\varphi$ can be written as a 2-CNF formula. Since satisfiability of 2-CNF formulas is
decidable in polynomial time, it is easy to decide for a given variable $x_i$ whether there exist
solutions in which it takes a particular value in {0, 1}. The variables which can only take
one value are assigned that value. Without loss of generality we can assume that the resulting
2-CNF formula is $\varphi(x_1, \dots, x_m)$.

Consider the graph of implications of $\varphi$ defined in the following way: the vertices are
the literals $x_1, \dots, x_m, \bar{x}_1, \dots, \bar{x}_m$. There is a directed edge from literal $l_1$ to literal $l_2$ if and
only if $\varphi$ contains a clause containing $l_2$ and the negation of $l_1$, which we denote by $\bar{l}_1$ (if
$l_1$ is a negated variable $\bar{x}$, then $\bar{l}_1$ denotes x). The directed edge represents the fact that in a
satisfying assignment if the literal $l_1$ is assigned true, then the literal $l_2$ is also assigned true.
We will show that $G(\varphi)$ is disconnected if and only if the graph of implications contains a
directed cycle. This property can be checked in polynomial time.

Suppose the graph of implications contains a directed cycle of literals
$l_1 \to l_2 \to \cdots \to l_k \to l_1$. By the construction, the graph also contains a directed cycle on the negations
of these literals, but in the opposite direction: $\bar{l}_k \to \bar{l}_{k-1} \to \cdots \to \bar{l}_2 \to \bar{l}_1 \to \bar{l}_k$. There
is a satisfying assignment s in which $l_1$ is assigned 1, and also a satisfying assignment t in
which $\bar{l}_1$ is assigned 1. By the implications, in s the literals $l_1, l_2, \dots, l_k$ are assigned 1, and
in t the literals $\bar{l}_1, \bar{l}_2, \dots, \bar{l}_k$ are assigned 1. Suppose there is a path from s to t. Then let $l_i$ be the first
literal in the cycle whose value changes along the path from s to t. Then there is a satisfying
assignment in which $l_i$ is assigned 0 whereas all other literals on the cycle are assigned 1. On
the other hand, this cannot be a satisfying assignment because the edge $(l_{i-1}, l_i)$ implies that
there is a clause containing only $l_i$ and the negation of $l_{i-1}$, and this clause is violated by the
assignment. This is a contradiction, therefore there can be no path from s to t.

Next, suppose the graph of implications contains no directed cycle, and $G(\varphi)$ is disconnected.
Let s and t be satisfying assignments from different connected components of $G(\varphi)$
that are at minimum Hamming distance. Let U be the set of variables on which s and t differ.
There are two literals corresponding to each variable, and let $U^s$ and $U^t$ denote respectively
the literals that are true in s and in t. The directed graph induced by $U^s$ in the implications
graph contains no directed cycle, therefore there exists a literal $l \in U^s$ without an incoming
edge from a literal in $U^s$. There is also no incoming edge from any other true literal in s,
because t is also satisfying. Thus the value of the corresponding variable can be flipped and
the resulting assignment is still satisfying. This assignment is in the same component as s but
it is closer to t, which contradicts our choice of s and t. □
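
The test from this proof can be sketched as follows. This is a hedged illustration, not the authors' implementation: literals are nonzero integers (+i for x_i, -i for its negation), every clause is a pair of literals, the formula is assumed satisfiable, and the forced variables are found by brute-force enumeration purely for brevity (a polynomial-time version would use a 2-SAT solver instead). All function names are hypothetical.

```python
from itertools import product


def satisfies(assign, clauses):
    """assign is a 0/1 tuple; variable i (1-based) has value assign[i - 1]."""
    return all(
        any((assign[abs(lit) - 1] == 1) == (lit > 0) for lit in clause)
        for clause in clauses
    )


def has_directed_cycle(graph):
    """Depth-first search with three colours on a dict {vertex: set of successors}."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {v: WHITE for v in graph}

    def visit(v):
        colour[v] = GREY
        for w in graph[v]:
            if colour[w] == GREY or (colour[w] == WHITE and visit(w)):
                return True
        colour[v] = BLACK
        return False

    return any(colour[v] == WHITE and visit(v) for v in graph)


def two_cnf_disconnected(n, clauses):
    """True iff the solution graph of the satisfiable 2-CNF formula is disconnected."""
    # Brute-force stand-in for the polynomial-time 2-SAT step of the proof.
    sols = [a for a in product((0, 1), repeat=n) if satisfies(a, clauses)]
    free = {i for i in range(1, n + 1) if len({s[i - 1] for s in sols}) == 2}
    graph = {lit: set() for v in free for lit in (v, -v)}
    for a, b in clauses:
        # Clauses touching a forced variable are already satisfied; skip tautologies.
        if abs(a) in free and abs(b) in free and a != -b:
            graph[-a].add(b)  # NOT a implies b
            graph[-b].add(a)  # NOT b implies a
    return has_directed_cycle(graph)


print(two_cnf_disconnected(2, [(-1, 2), (-2, 1)]))  # x1 <-> x2
print(two_cnf_disconnected(2, [(1, 2)]))            # x1 OR x2
```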

Lemma 4.10. If S is an affine set of relations then there is a polynomial time algorithm
for Conn(S).

Proof. An affine formula can be described as the set of solutions of a linear system of
equations. Flipping a single variable that appears in at least one of the equations turns any
solution into an assignment that is not a solution. Therefore it suffices to check whether
the system has more than one solution (after variables that don't appear in any equation are
removed), which is easy by checking the rank of the matrix obtained from the Gaussian
elimination algorithm. □
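
A minimal sketch of this rank test over GF(2) is given below. The encoding is an assumption made for illustration: each equation is a pair (variables, rhs) stating that the XOR of the listed variables equals rhs, no variable is repeated within an equation, and the names `gf2_rank` and `affine_connected` are hypothetical.

```python
def gf2_rank(rows):
    """Rank over GF(2) of vectors given as integer bitmasks (XOR-basis elimination)."""
    basis = []
    for r in rows:
        for b in basis:
            r = min(r, r ^ b)  # clears the leading bit of b in r when it is set
        if r:
            basis.append(r)
    return len(basis)


def affine_connected(equations):
    """True iff the solution graph of the linear system is connected.

    Variables that occur in no equation are ignored, as in the proof; the
    graph is then connected exactly when the remaining system has a unique
    solution, i.e. when the rank equals the number of occurring variables.
    """
    used = sorted({v for variables, _ in equations for v in variables})
    column = {v: k for k, v in enumerate(used)}
    coeff = [sum(1 << column[v] for v in variables) for variables, _ in equations]
    augmented = [(mask << 1) | rhs for mask, (_, rhs) in zip(coeff, equations)]
    rank = gf2_rank(coeff)
    if gf2_rank(augmented) != rank:
        raise ValueError("the system has no solution")
    return rank == len(used)


print(affine_connected([((0, 1), 1)]))              # x0 + x1 = 1: two isolated solutions
print(affine_connected([((0, 1), 1), ((1,), 0)]))   # unique solution x0 = 1, x1 = 0
```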

We are left with characterizing the complexity of Conn for sets of Horn relations and
for sets of dual Horn relations. In the conference version [10] of the present paper, we had
conjectured that if S is Horn or dual Horn, then Conn(S) is in P, but this was disproved
by Makino, Tamaki and Yamamoto [17]. They showed that Conn($\{R_2\}$) is coNP-complete,
where $R_2 = \{0,1\}^3 \setminus \{110\}$, hence there exist Horn (and by symmetry also dual Horn) sets of
relations for which Conn is coNP-complete. Their proof is via a reduction from Positive
Not-All-Equal 3-Sat, which as seen earlier is Sat($\{R_{\mathrm{NAE}}\}$), where $R_{\mathrm{NAE}} = \{0,1\}^3 \setminus
\{000, 111\}$. This problem is also known as 3-Hypergraph 2-Colorability.

The relation $R_2$ is a 3-clause with one positive literal. We will describe a natural set of
Horn relations first introduced in fSl, which cannot be used to express $R_2$. We show that for
this set there is a polynomial time algorithm for Conn.

Definition 4.11. A logical relation R is implicative hitting set-bounded−, or IHSB−,
if it is the set of solutions of a Horn formula in which all clauses of size greater than 2 have
only negative literals. Similarly, R is implicative hitting set-bounded+, or IHSB+, if it is the
set of solutions of a dual Horn formula in which all clauses of size greater than 2 have only
positive literals.

These types of logical relations can be characterized by closure properties. A relation R
is IHSB− if and only if it is closed under $a \wedge (b \vee c)$; in other words, if $a, b, c \in R$, where R
is of arity r, then $a \wedge (b \vee c) = (a_1 \wedge (b_1 \vee c_1), a_2 \wedge (b_2 \vee c_2), \dots, a_r \wedge (b_r \vee c_r)) \in R$.
A relation R is IHSB+ if and only if it is closed under $a \vee (b \wedge c)$. While the definition
may at first look unnatural, it comes from Post's classification of Boolean functions (see
lIU). One of the consequences of this classification is that IHSB− relations cannot express
all Horn relations, and in particular $R_2$, even in the sense of Schaefer's expressibility. For
the purposes of faithful expressibility we can define an even larger class of relations which
cannot faithfully express $R_2$ (unless P = coNP).
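
The closure characterization is easy to test by brute force on small relations. The sketch below is illustrative; relations are sets of 0/1 tuples, Horn relations are recognized by the standard closure under coordinatewise AND, and the names `closed_under`, `is_horn`, `is_ihsb_minus`, `R2` and `NEG3` are hypothetical.

```python
from itertools import product


def closed_under(relation, op, k):
    """Is the relation closed under the k-ary bit operation op applied coordinatewise?"""
    rel = set(relation)
    return all(
        tuple(op(*bits) for bits in zip(*args)) in rel
        for args in product(rel, repeat=k)
    )


def is_horn(relation):
    # Horn relations are exactly those closed under coordinatewise AND.
    return closed_under(relation, lambda a, b: a & b, 2)


def is_ihsb_minus(relation):
    # Closure under a AND (b OR c), as in the characterization above.
    return closed_under(relation, lambda a, b, c: a & (b | c), 3)


CUBE = set(product((0, 1), repeat=3))
R2 = CUBE - {(1, 1, 0)}    # the 3-clause with one positive literal
NEG3 = CUBE - {(1, 1, 1)}  # a 3-clause with only negative literals

print(is_horn(R2), is_ihsb_minus(R2))
print(is_horn(NEG3), is_ihsb_minus(NEG3))
```

For $R_2$ the closure fails on a = 111, b = 100, c = 010, since $a \wedge (b \vee c) = 110$ is the one excluded tuple.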

Definition 4.12. A logical relation R is componentwise IHSB− (IHSB+) if every
connected component of $G(R)$ is IHSB− (IHSB+).

By Lemma 4.1, every relation that is IHSB− (IHSB+) is also componentwise IHSB−
(IHSB+). Of course, the class of componentwise IHSB− relations is much broader, and in
fact includes relations that are not even Horn, such as $R_{1/3}$. However, in the following lemma
we are only considering componentwise IHSB− (IHSB+) relations which are Horn (dual
Horn). We will say that a set of relations S is componentwise IHSB− (IHSB+) if every
relation in S is componentwise IHSB− (IHSB+).
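
The difference between IHSB− and componentwise IHSB− can be seen on the 1-in-3 relation {100, 010, 001}. The following brute-force sketch is illustrative only; relations are sets of 0/1 tuples and all names are hypothetical.

```python
from itertools import product


def components(relation):
    """Connected components of the subgraph of the hypercube induced by the relation."""
    rel = set(relation)
    comps = []
    while rel:
        start = rel.pop()
        comp, stack = {start}, [start]
        while stack:
            a = stack.pop()
            for i in range(len(a)):
                b = a[:i] + (1 - a[i],) + a[i + 1:]  # neighbour at Hamming distance 1
                if b in rel:
                    rel.remove(b)
                    comp.add(b)
                    stack.append(b)
        comps.append(comp)
    return comps


def is_ihsb_minus(relation):
    """Closure under a AND (b OR c), applied coordinatewise."""
    rel = set(relation)
    return all(
        tuple(x & (y | z) for x, y, z in zip(a, b, c)) in rel
        for a, b, c in product(rel, repeat=3)
    )


def is_componentwise_ihsb_minus(relation):
    return all(is_ihsb_minus(comp) for comp in components(relation))


ONE_IN_THREE = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}
R2 = set(product((0, 1), repeat=3)) - {(1, 1, 0)}

print(is_ihsb_minus(ONE_IN_THREE), is_componentwise_ihsb_minus(ONE_IN_THREE))
print(is_componentwise_ihsb_minus(R2))
```

The 1-in-3 relation splits into three singleton components, each trivially closed, while the graph of {0,1}^3 \ {110} is connected, so for it the two notions coincide.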

Lemma 4.13. If S is a set of relations that are Horn (dual Horn) and componentwise
IHSB− (IHSB+), then there is a polynomial time algorithm for Conn(S).

Proof. First we consider the case in which every relation in S is IHSB−. The formula
can be written as a conjunction of Horn clauses, such that clauses of length greater than 2
have only negative literals. Let all unit clauses be assigned and propagated; their variables
take the same value in all satisfying assignments. The resulting formula is also IHSB−, and
has two kinds of clauses: 2-clauses with one positive and one negative literal, and clauses of
size 2 or more with only negative literals. The assignment of zero to all variables is satisfying.
By Lemma 4.3, there is more than one connected component if and only if there is another
assignment that is locally minimal. A locally minimal satisfying assignment is such that if any
of the variables assigned 1 is changed to 0, the resulting assignment is not satisfying. Thus
every variable assigned 1 appears as the positive literal in at least one 2-clause with one
positive and one negative literal for which both variables are assigned 1. We say that such an
assignment certifies the disconnectivity.

To describe the algorithm, we first define the following implication graph G. The vertices
are the set of variables. There is a directed edge $(x_i, x_j)$ if and only if $(x_j \vee \bar{x}_i)$ is a clause
in the IHSB− representation. Let $S_1, \dots, S_m$ be the sets of variables in clauses with only
negative literals. For every variable $x_i$ let $T_i$ denote the set of variables reachable from $x_i$ in
the directed graph by a path with at least one edge. Note that if $x_i$ is set to 1, then every
variable in $T_i$ must also be set to 1. The algorithm rejects if and only if there exists a variable
$x_i$ such that $x_i \in T_i$ and $T_i$ does not contain $S_j$ for any $j \in \{1, \dots, m\}$. We show that this
happens if and only if the solution graph is disconnected. Note that the algorithm runs in
polynomial time.
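
The algorithm just described can be sketched as follows. The input encoding is an assumption for illustration: variables are numbered 0, ..., n-1, unit clauses are assumed to have been propagated already, `implications` lists pairs (i, j) for the clauses $(x_j \vee \bar{x}_i)$ with i different from j, and `negative_clauses` lists the sets $S_1, \dots, S_m$. The function names are hypothetical.

```python
def reachable(adj, start):
    """Vertices reachable from start by a directed path with at least one edge."""
    seen = set()
    stack = list(adj.get(start, ()))
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(adj.get(v, ()))
    return seen


def ihsb_minus_disconnected(n, implications, negative_clauses):
    """True iff the algorithm rejects, i.e. the solution graph is disconnected."""
    adj = {}
    for i, j in implications:
        adj.setdefault(i, set()).add(j)  # edge (x_i, x_j): x_i = 1 forces x_j = 1
    for i in range(n):
        t_i = reachable(adj, i)
        on_cycle = i in t_i
        violates = any(set(s) <= t_i for s in negative_clauses)
        if on_cycle and not violates:
            return True  # setting exactly T_i to 1 is a second locally minimal solution
    return False


# x0 <-> x1: solutions 00 and 11 lie in different components.
print(ihsb_minus_disconnected(2, [(0, 1), (1, 0)], []))
# Adding the negative clause (NOT x0 OR NOT x1) leaves only the all-zero solution.
print(ihsb_minus_disconnected(2, [(0, 1), (1, 0)], [{0, 1}]))
```

Computing each $T_i$ by a separate graph search keeps the sketch short; the total work is still polynomial in the size of the formula.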

Assume that the graph of solutions is disconnected and consider the satisfying assignment
s that certifies disconnectivity. Let U be the set of variables $x_i$ such that $s_i = 1$. Since
every variable in U appears in at least one 2-clause for which both variables are from U, the
directed graph induced by U is such that every vertex has an incoming edge. By starting at
any vertex in U and following the incoming edge backwards until we repeat some vertex, we
find a cycle in the subgraph induced by U. For any variable $x_i$ in such a cycle it holds that
$x_i \in T_i$. Further $T_i \subseteq U$, since setting $x_i$ to 1 forces all variables in $T_i$ to be 1. Also $T_i$
cannot contain $S_j$ for any j, else the corresponding clause would not be satisfied by s. Thus
the algorithm rejects whenever the solution graph is disconnected.

Conversely, if the algorithm rejects, there exists a variable $x_i$ such that $x_i \in T_i$ and $T_i$
does not contain $S_j$ for any $j \in \{1, \dots, m\}$. Consider the assignment in which all variables
from $T_i$ are assigned 1, and the rest are assigned 0. We will show that this assignment is
satisfying and it is a certificate for disconnectivity. Clauses which contain only negated variables
are satisfied since $S_j \not\subseteq T_i$ for all j. Now consider a clause of the form $(x_j \vee \bar{x}_k)$ and note
that there is a directed edge $(x_k, x_j)$. If $x_k = 0$, this is satisfied. If $x_k = 1$ then $x_k \in T_i$,
and hence $x_j \in T_i$ because of the edge $(x_k, x_j)$. But then $x_j$ is set to 1, so the clause is
satisfied. To show that this solution is minimal, consider trying to set $x_k \in T_i$ to 0. There is an
incoming edge $(x_j, x_k)$ for some $x_j \in T_i$, and hence a clause $(x_k \vee \bar{x}_j)$, which will become
unsatisfied if we set $x_k = 0$. Thus we have a certificate for the space being disconnected.

Next, consider a formula $\varphi(x_1, \dots, x_n)$ in CNF(S). We reduce the connectivity question
to one for a formula with IHSB− relations. Since satisfiability of Horn formulas is
decidable in polynomial time and every connected component of a Horn relation is a Horn
relation by Lemma 4.3, it is easy to decide for a given clause and a given connected component
of its corresponding relation (the relation obtained after identifying repeated variables),
whether there exists a solution for which the variables in this clause are assigned a value in
the specified connected component. If there exists a clause for which there is more than one
connected component for which solutions exist, then the space of solutions is disconnected.
This follows from the fact that the projection of $G(\varphi)$ on the hypercube corresponding to the
variables appearing in this clause is disconnected. Therefore we can assume that the relation
corresponding to every clause has a single connected component. Since that component is
IHSB−, the relation itself is IHSB−. □

It is still open whether Conn is coNP-complete for every remaining Horn set of relations,
i.e. every set of Horn relations that contains at least one relation that is not componentwise
IHSB−. Following the same line of reasoning as in the proof of our Faithful Expressibility
Theorem, we are able to show that one of the paths of length 4 defined in Section 3.2,
namely $M(x_1, x_2, x_3)$, can be expressed faithfully from every such set of relations. Thus the
trichotomy would be established if one shows that Conn($\{M(x_1, x_2, x_3)\}$) is coNP-hard.

5. Discussion and Open Problems. In Section 2, we conjectured a trichotomy for
Conn(S). In view of the results established here, what remains is to pinpoint the complexity
of Conn(S) when S is Horn but not componentwise IHSB−, and when S is dual
Horn but not componentwise IHSB+.

We can extend our dichotomy theorem for st-connectivity to CNF(S)-formulas without
constants; the complexity of connectivity for CNF(S)-formulas without constants is open.
We conjecture that when S is not tight, one can improve the diameter bound from $2^{\Omega(\sqrt{n})}$
to $2^{\Omega(n)}$. Finally, we believe that our techniques can shed light on other connectivity-related
problems, such as approximating the diameter and counting the number of components.